REMARKS 

In the Final Office Action dated May 30, 2008, claims 1-8, 13 and 16-83 are pending. 
Claims 1, 5, 13, 16, 32-33 and 83 are under examination. Claims 2-4, 6-8, 17-31 and 34-82 are 
withdrawn from consideration. Claims 16, 32-33 and 83 are objected to allegedly because these 
claims recite nonelected subject matter in the alternative. Claims 1, 5, 13, 32-33 and 83 are 
rejected under 35 U.S.C. § 112, first paragraph, for allegedly not being enabled by the specification.
Claims 1, 5, 13, 32-33 and 83 are rejected under 35 U.S.C. § 112, first paragraph, for allegedly not
satisfying the written description requirement. The Examiner also states that Applicants' 
Information Disclosure Statement submitted on April 29, 2008 has been considered, but no art 
rejection is raised because the Examiner is unable to establish a relationship between instant SEQ 
ID NO: 7 and the sequence referred to by Mack et al. (GenBank AB033025).

This Response addresses each of the Examiner's rejections and objections. Applicants
therefore respectfully submit that the present application is in condition for allowance. Favorable 
consideration of all pending claims is therefore respectfully requested. 

Amendments to Claims 

Independent claims 1 and 16 have been amended to further define the biological sample 
as a "blood, serum, stool or gastrointestinal tract sample". Support for this amendment is found in 
the specification, e.g., page 24, lines 14-26. 

Independent claims 1 and 16 have also been amended to delete the term "substantially" 
in reference to sequence identifiers. Further, claim 1 has been amended to delete the expression 
"predisposition to the onset". 

No new matter is introduced by the foregoing amendments.

Examiner's Remarks Regarding KIAA1199

Before responding to the Examiner's rejections and objections, Applicants wish to first 
address the Examiner's remarks with respect to KIAA1199. Specifically, the Examiner states that
she is unable to establish a relationship between instant SEQ ID NO: 7 and the KIAA1199
sequence referred to by Mack et al. See page 2, Section 4 of the Office Action. On page 16 of the
Action, the Examiner further states that there is no record in the prior art relating to KIAA1199 or
GenBank Accession No. NO000015. The Examiner indicates that she attempted to BLAST the
KIAA1199 mRNA sequence against SEQ ID NO: 7 and obtained no similarities.

Applicants respectfully submit that it is a matter of routine procedure to conduct a 
BLAST search using SEQ ID NO: 7 as the query sequence. The results of such a BLAST search 
are in fact extremely clean in that hits are only obtained with respect to KIAA1199, with the next
closest hits exhibiting less than 4% query coverage. Applicants are providing herewith as Exhibit
1, a document which summarizes the BLAST results of SEQ ID NO: 7 and which demonstrates
that the sequence which one obtains is the KIAA1199 sequence. SEQ ID NO: 7 aligns to the map
region 51,882,643-51,915,746 on chromosome 15. 

Objection to Claims 

Claims 16, 32-33 and 83 are objected to allegedly because these claims recite 
nonelected subject matter in the alternative. 

Applicants respectfully disagree with the Examiner. Independent claim 16 is directed 
to detection of co-expression of two or more nucleic acid molecules, at least one of which is the 
elected nucleic acid molecule comprising SEQ ID NO: 7. That is, the elected nucleic acid 
molecule which comprises SEQ ID NO: 7 is analyzed together with any one or more of the other
SEQ ID NOs. recited in subparagraph (i). Assuming that the use of SEQ ID NO: 7 is ultimately
held to be patentable, the method of claim 16 based on the combined use of SEQ ID NO: 7 with 
one or more other nucleic acids should also be found patentable. It is respectfully submitted that 
the claims, as presently recited, properly reflect the elected subject matter (SEQ ID NO: 7). 

Accordingly, the objection to the claims is overcome and withdrawal thereof is 
respectfully requested. 

35 U.S.C. § 112, First Paragraph - Enablement

Claims 1, 5, 13, 32-33 and 83 are rejected under 35 U.S.C. § 112, first paragraph, for
allegedly not being enabled by the specification. With respect to the Examiner's analysis based on 
the In re Wands factors, Applicants respectfully submit the following and maintain that those 
skilled in the art would be able to practice the methods, as presently claimed, without undue 
experimentation. 

Nature and Breadth of the Invention 

The Examiner has noted that claim 1 continues to recite the language "or a 
predisposition to the onset" of an adenoma. This language has been deleted from claim 1 by way 
of the instant amendment. As amended, claim 1 is directed to a method of determining the onset 
of a colorectal adenoma. 

The Examiner has also interpreted the language of "measuring the level of expression" 
of a nucleic acid molecule as including the measurement of transcription or translation of a nucleic 
acid molecule. Applicants submit that the Examiner's interpretation is correct in this regard. 
Clearly, methods for measuring transcription and translation of nucleic acid molecules are well 
known and have been widely used for many years.

With respect to the claim language referencing nucleic acid molecules hybridizing to
SEQ ID NO: 7 under high stringency conditions, the Examiner asserts that many nucleic acid
molecules would hybridize to SEQ ID NO: 7 under these conditions, including homologs or variants
of SEQ ID NO: 7. However, the Examiner does not appear to have appreciated that high
stringency conditions, by their very nature, permit hybridization of molecules exhibiting only very 
high levels of sequence identity with SEQ ID NO: 7. Accordingly, those skilled in the art would 
understand that the claims would not encompass a wide class of molecules unrelated to SEQ ID 
NO: 7. Further, under the law, Applicants are not required to limit the claims to those specifically 
exemplified embodiments; but rather, are permitted to have a breadth of claimed subject matter 
which is consistent with the disclosure of the specification. In re Anderson, 176 USPQ 331, 333
(CCPA 1973). Based on the guidance provided in the specification, those skilled in the art would
be able to identify nucleic acid molecules capable of hybridizing to SEQ ID NO: 7 under high
stringency conditions, and to further use the identified molecules in the claimed methods, without
undue experimentation. Applicants' position in this regard is further supported by the discussion
in relation to KIAA1199 below.

The Examiner continues to assert that the claims encompass within their scope 
"homologues, variants or the like" of SEQ ID NO: 7. However, this text was already deleted from 
the claims in response to the previous Office Action. As submitted above, while the claims do 
encompass related hybridizing molecules, such molecules must share a high level of identity with 
SEQ ID NO: 7. 

With regard to the Examiner's rejection of the breadth of the term "biological sample", 
Applicants have amended the claims to define the sample as a blood, serum, stool or
gastrointestinal tract sample. Applicants respectfully submit that the specification specifically
discloses that change of expression can be detected in a blood, serum, stool or gastrointestinal tract
sample. See, e.g., page 24, lines 14-26 of the specification. Given that a change in expression is
observed and has been specifically demonstrated in adenoma tissue (see, e.g., pages 100-107 of the
specification), those skilled in the art would reasonably expect that the change would also be found
in the stool, since the adenoma cells are shed into the stool. In addition, those skilled in the art
would also reasonably expect that the change in expression could be detected in the blood and
serum. In support of Applicants' position in this regard, it is submitted that it has been
documented in the art that change in marker levels of colorectal neoplastic tissue could be detected
in the blood, as reported by, e.g., Park, Oncology 22: 147 (2008) "Biomarker CCA-2 may provide
accurate blood test for colorectal cancer", and Walgenbach-Brunagel et al., Cell. Biochem. 104:
286-294 (2008) (Exhibit 2). There also exist classic examples of serum biomarkers for other types
of cancers, including the Prostate Cancer Antigen (PSA), carcinoembryonic antigen (CEA), and
carbohydrate antigen 19-9 (CA 19-9).

Guidance in the specification and working examples

The Examiner alleges on page 6 of the Office Action that "there is no external working
example which validates the use of SEQ ID NO: 7 as a marker for colorectal adenoma." The
Examiner apparently recognizes that the specification discloses that clones 8-2d and 12-2f, to
which SEQ ID NO: 7 corresponds, were up-regulated by 50 and 45 fold, respectively, in adenoma
tissue samples. However, the Examiner states that the specification is silent on how these two
clones are actually related, or how these clones relate to SEQ ID NO: 7.

In the first instance, Applicants respectfully submit that 8-2d and 12-2f are two different
clones which were generated from the same starting materials. Both clones express the SEQ ID
NO: 7 sequence. It is not uncommon for more than one clone expressing the same nucleotide
sequence to be generated. Both clones were identified as corresponding to SEQ ID NO: 7 and the
results obtained using both these clones were consistent, with 8-2d showing a 50-fold increase
relative to mean expression levels of normal tissue, and 12-2f showing an average of 45-fold
increase relative to mean expression levels. Therefore, the Examiner's allegation that the
specification does not validate the use of SEQ ID NO: 7 as a marker of colorectal adenoma is
unfounded.

Applicants are providing herewith additional support for the use of SEQ ID NO: 7 as a
marker of colorectal adenoma. As discussed above, SEQ ID NO: 7 corresponds to the KIAA1199
gene. Exhibit 3 is an extract from a company report prepared in July of this year, which relates to
KIAA1199 and demonstrates that it exhibits a 25-fold higher mean expression in adenoma than in
normal tissue when tested across 19 patients and 30 normal individuals. Applicants provide the
following additional information in relation to the data presented in Exhibit 3:

(i) The analyses which were performed looked at three regions of KIAA1199, of which one
was the SEQ ID NO: 7 region. All three regions produced results which confirm that the
expression of KIAA1199 was significantly increased in adenoma versus normal tissue. As
can be seen from the results, there do occur a few isolated samples in the normal tissues
which show a higher level of KIAA1199 expression, just as there occur a few adenoma tissue
samples which show lower levels of KIAA1199. However, the mean results clearly
indicate a very significant increase in KIAA1199 expression relative to normal tissue, fully
supporting the claimed methods based on detecting an increased level of expression of a
nucleic acid comprising SEQ ID NO: 7 as "indicative" of the onset of an adenoma. These
results in Exhibit 3 are consistent with the results provided in the specification and
conclusively demonstrate the value of screening for increased expression of a nucleic acid
molecule comprising SEQ ID NO: 7 (KIAA1199) as a marker of adenoma onset.

(ii) These results confirm that one can screen for any region of KIAA1199, being the gene
comprising and characterized by SEQ ID NO: 7.

(iii) These results are also consistent with the results obtained and described in the specification
using clones 8-2d and 12-2f, although not precisely identical in terms of overall fold
increase. Those skilled in the art would appreciate that within any screening system there
will always occur a certain degree of deviation where one performs the same experiment,
with the same reagents, two or more times. However, both the results presented in Exhibit
3 and the results described in the specification support the conclusion that increase in
expression of a nucleic acid molecule comprising SEQ ID NO: 7 is indicative of adenoma.

The unpredictability of the art and the state of the art

The Examiner has again reiterated that the claimed diagnostic method relates to an 
extremely unpredictable art. In this regard, the Examiner has referred to results obtained in relation 
to prostate specific membrane antigen (PSMA). However, Applicants respectfully submit that the 
present application does not concern PSMA. Rather, it concerns a nucleic acid molecule 
comprising SEQ ID NO: 7. Based on the data provided in the specification and in the exhibits
attached hereto, expression levels of a nucleic acid comprising SEQ ID NO: 7 have been shown to 
provide strong indication of adenoma development. 

Further, Applicants respectfully submit that analysis of both nucleic acid expression and 
protein expression is widely used in many different diagnostic disciplines as a marker of the onset 
of certain disease conditions. The fact that such screening assays are so widely applied would 
suggest that the art is in fact not unpredictable and that PSMA, for example, may represent the
exception rather than the rule. To this end, the Examiner's assertion that it is highly unpredictable
what level of expression of SEQ ID NO: 7 must be observed in order to conclude that adenoma is
present, particularly in light of the additional data provided in Exhibit 3, is arguably preposterous.
The data clearly teach that a level of expression above the normal level is
indicative of adenoma. There is no need to focus on any specific expression level; rather, one need
only observe a level which is higher than the normal level, as would be understood by those skilled in the art.
There is no extensive unpredictable experimentation which would be required to be undertaken in
order to diagnose cancer. To this end, Applicants also attach an article by Sabates-Bellver (Mol.
Cancer Res. 5: 1263-1275, 2007) (Exhibit 4), which was published subsequently to the filing of
the present application and confirms that those in the art consider KIAA1199 as extremely
important in the context of the diagnosis of adenoma.

Quantity of experimentation

The Examiner contends that it would require extensive experimentation before those 
skilled in the art could practice the claimed invention. In the first instance, the Examiner's 
attention is directed to the fact that reference to functional derivatives, variants, homologues and 
the like has been deleted from the claims. Although the claims encompass those that hybridize at 
levels of high stringency to SEQ ID NO:7, these hybridizing molecules must share a high level of 
identity with SEQ ID NO:7, as discussed above. One would expect that these hybridizing 
molecules would be molecules exhibiting minor variations in sequence, such as different isoforms 
of the molecule. To the extent that such isoforms may exist, the data presented in Exhibit 3 
certainly indicate that across 19 patients and 30 non-diseased control individuals, irrespective of 
the form of SEQ ID NO: 7 which they may express, the levels of KIAA1199 are increased in
patients who have undergone the onset of an adenoma.

The Examiner further argues that it would require extensive experimentation to screen 
for a SEQ ID NO: 7 expression product. Applicants disagree with this assertion. Once a DNA 
molecule has been identified, its encoded expression product is routinely identifiable and can be 
analyzed by routine procedures. In this connection, once the recognition is provided by the present 
invention that the gene characterized by SEQ ID NO: 7 is diagnostic of the onset of adenoma, 
experimentation relating to establishing that SEQ ID NO: 7 characterizes KIAA1199, and
experimentation relating to electing an appropriate form of KIAA1199 (e.g. DNA or protein) to
screen for, or determining how a screening assay should be conducted, are well within the scope of
those skilled in the art, and would be a matter of routine procedure. In this connection, Applicants
respectfully submit that additional experimentation is permissible. In re Wands, 858 F.2d 731,
736-737, 8 U.S.P.Q.2d 1400, 1404 (Fed. Cir. 1988). Necessary experimentation is not determinative
of the question of enablement; only undue experimentation is fatal under the provisions of 35
U.S.C. § 112, first paragraph. Id. In the present case, Applicants submit that any additional
experimentation, if needed, in order to practice the claimed invention is routine and not undue. 

In sum, it is respectfully submitted that the evidence and guidance provided by the 
specification, together with the knowledge of persons of ordinary skill in the art, clearly enable 
those skilled in the art to practice the claimed methods, without undue experimentation. Therefore, 
it is respectfully requested that the Examiner withdraw the rejection based on the alleged 
non-enabling disclosure provided in the specification. 



35 U.S.C. § 112, First Paragraph - Written Description

Claims 1, 5, 13, 32-33 and 83 are rejected under 35 U.S.C. § 112, first paragraph, for
allegedly not satisfying the written description requirement. The Examiner's rejection centers
around the genus of nucleic acid molecules encompassed by the claims which are employed in the 
claimed methods. 

Applicants first respectfully submit that the present claims do not include any reference 
to "functional derivative, variant or homologue". With respect to molecules that are capable of 
hybridizing to SEQ ID NO: 7 under high stringency conditions, as submitted, this is a relatively
small and well defined genus, as nucleic acids that hybridize to SEQ ID NO: 7 under high 
stringency conditions must share a very high sequence identity with SEQ ID NO: 7. 

Applicants further respectfully submit that nucleic acids claimed based on 
hybridization language may be considered to have met the written description requirement, 
because highly stringent hybridization conditions dictate that the species within the claimed genus 
are structurally similar, i.e., similar in sequence to the recited sequence in the claims. See Enzo
Biochem, Inc. v. Gen-Probe Inc., 323 F.3d 956, 967-968 (Fed. Cir. 2002).

Accordingly, Applicants respectfully submit that those skilled in the art would have 
considered that the specification adequately and clearly describes the genus of nucleic acids as 
presently claimed, and that the specification has conveyed to those skilled in the art that Applicants 
are in possession of the claimed genus. As such, it is respectfully requested that the written 
description rejection under 35 U.S.C. § 112, first paragraph, be withdrawn.

Conclusion

In view of the foregoing amendments and remarks, it is firmly believed that the subject 
application is in condition for allowance, which action is earnestly solicited. 



Respectfully submitted, 




Xiaochun Zhu 
Registration No. 56,311



SCULLY, SCOTT, MURPHY & PRESSER, P.C.

400 Garden City Plaza-STE 300 

Garden City, New York 11530 

(516) 742-4343 

XZ:ab

Enc.: Exhibits 1-4






EXHIBIT 1



Executive Summary 



The alignment search tool, BLAST, is available through NCBI.

The "Nucleotide blast" tool, BLASTn, will search a NUCLEOTIDE database using a NUCLEOTIDE
query.

SEQID7 (Appendix 1) was queried using the standard settings in NCBI/BLAST/blastn:

- 8 BLAST hits on the Query sequence

- Two significant hits were obtained: Human Chromosome 15 genomic contig reference
assembly NT_010194.16 and NW_001838219.1

- The remaining 4 hits had less than 4% query coverage.

- The alignment of SEQID7 and the top hit, NT_010194.16, is given in Appendix 2

- Opening the link to the matched reference sequence NT_010194.16 will result in a graphic




The KIAA1199 gene is in map region 51,862,032-52,034,319 on Chr15
SEQID7 aligns to the map region 51,882,643 - 51,915,748
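
The relationship between the two map regions stated above can be checked with simple interval arithmetic. The following is a minimal illustrative sketch only, not part of the BLAST output: the function name and variable names are hypothetical, the coordinates are those stated in this Exhibit, and the start coordinate of the SEQID7 region is taken as 51,882,643, as given in the Remarks.

```python
# Illustrative sketch: checks that the region to which SEQID7 aligns lies
# inside the map region stated for the KIAA1199 gene on chromosome 15.
# Coordinates are copied from this Exhibit; the names are hypothetical.

KIAA1199_REGION = (51_862_032, 52_034_319)  # stated gene map region (start, end)
SEQID7_REGION = (51_882_643, 51_915_748)    # stated SEQID7 alignment region (start, end)


def lies_within(inner, outer):
    """Return True if the closed interval `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    # Prints True: the SEQID7 alignment region falls within the KIAA1199 map region.
    print(lies_within(SEQID7_REGION, KIAA1199_REGION))
```

On the stated coordinates the check returns True, which is consistent with the conclusion that SEQID7 maps within the KIAA1199 gene region.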



http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Nucleotides&PROG[...]PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on



APPENDIX 1 SEQID7 NUCLEOTIDE SEQUENCE

aagc^ctctcaacogaaacctccccaggggctac^gtggcctttccatgtggctttctcaeagcatgttggctgtgttccaatggtgaaegt 
ccacagagagagagagagaccrcagtggaaggcacatcatttttctaaacgactcctggaagttacacttctgctccatcctgggactacraggg 
cacataacaccatgccagctaattttatgtgtgtgtgtgtgtgt^tgtgtgtgtgtgtgtgtgtgtgcgtagacgggatctcatcatcffcgcc 
ca^gttagtcttggactcctggactcaagtgatcctcccatctcaacotccaaaagtgctgggttacaggcatgagtcactgcacctggctgg . 
aaatttgttaatagcctatgttgaaggggtagctgaaatcacctcaccatcctctgggtttccagagcacctccattcttatagcccatgtga 
gtcgtatggtggggtgtggttctatgtcctttctcgccctctgtctgggactgccgagagagcaggtctcatgtcatttatttgtagagtctc 
agggcctattgcaggatetgacagagtcaatgactgctacctttgcggaatgaatgaataaaatcattaatggctgaatgtggctggcttttc 
cacgccttcccacagctggggtacttaatattggctgaggcaactacttttaaactgttggtatttctctttaataaaatct-tgggaaaacct 
tga c 1 1 1 ca 1 9 t ca t tttac t tt gggac t tt t tccaaa atccaggc tt t a t ttttca tcaaac acat g tea t ga t cat g c tq ttagggagtc t 
1 1 ac aa a scat ca teat gc tc tt gagggaat c 1 1 ttga aaacctt ac t ttaga tc aga gt t a gagaagaaa t tcac attc t aa taga t tt gca 
gggtaattgatattctcgccatctctgttcatatttgaaatatttcagtactcgatgtaggggcaaaaacattgagtttacacctt:ctaataa 
ctttccaaaaacctgttataaagtaaaactgctgattcagaggtttggggatctctggggatacagctcagccttggggcccagggoctactig 
tagctgggctacaccttcctctccagcttcttgtccagctgcttctcccttctgttttagactctageaacatcctaggattgttatggtcct 
gttgatgcaatgctgcttcttgccatcttgctgctgtaaatgctgctttctctgctcaatcatctagcaaactaccattcattcttcctgace 
ctgctgaggcatccccttctctgtgaagagtteoctctctccttctccaatgtatcagtaagctattgctgtgtaataaaccaccccaaaggc 
agtggcttgaaacaactgtgtattattgtcctgtgggtcaaccagctggttctgctgatccggacaggcttggctaatctcaactctgttttg 
tatctatcggcagaacaactggaggctggctggtctaggatggectcatttatgtgtttggcattggctagcjtctcaattcagtggatgaggg 
. tgactggaccatgcgt^ctctcatcacccagtagaqtagcctgggtttgttctcgaggtgactgatgctgttctgtgagagagaaggaaagcat 
gcatggcctctggaggcctggatctcaaaacccaatggcacjaccagcacttctgctgctttcttttggctgaagccaggtcagttcatgttca 
aggggaggagatttagactctaccttttaatgggagaagctgcatagttacattgcaaaagacaaggatctgaggggaggagagaaggatgag 
tggaatggttgattgatttttgtgatcaatocaceacacccacctttgatagaggtacttactctgtagtacaattggccttccatactgtgg 
gttccatacctatagatteaaccaattgcaaactgaaaatatttgaaatatgt^ttgcatctgcactgaacatgtacagactatttttcttgtc 
cttacaggataataatacgggataaeaaotattgacaaagcatttacattgtattaggtattataagtaatctagagatgatttaaagtatac 
agga ggat g ta t g t atgt tatatgeaaatact ac actct tt tata ttagggac ttgagc ate tggagagtgtggtatc tgagggagt tc c tgg 
aactaatgtgcagatgccaagggacaactgtactattgtacttggaagtactcatggggtcatattgcattgtttctttgagtcctaattctg 
ccaacatggcctggtgcttgcattaatcagctttctaatctctgagtaacaaggcacagtaacaaggagcagtaacaaggcacagggctggca 
cctgagagtggaggtacccaggaggcagacaccataaggcgggaaagggacatatgtacagaatcatggctgeatgtcctgaagcctggctta 
agccatcaacggotgctgggcaggggccaaagccctgttatccctttcgcocttectgatggctctgcctctgccttcagctgggcgtgggca 
ggccccaccoacegaggotccagcccttacccacagtgtcagcaatgdagcetceagaggatgtgctcaggccctgcccacacacccggatgt 
tgacaggggcatgactecagcgccagctctaatggatggtctaatcgottttaaaataatgaccatggggcgtgggctggegagagcagtgac 
atcactttcctgcaattctgggtcagttcctgctgcttttctctgtatgtttgaatgactgaaataaatctattggttggatatatttcctgg 
aagacttctgacatgttcacatgcctatcttggaatgtggtcaggagagcaatggctttggacttagaggtcctgggttcaagattctgctac 
tagcttgctgtatgaacttggaetagcaacttaacttctccaggcctgtgttttctcatttgtacaatgatgggaggaatacccttggttttg 
taaaggaatggtgaggaogaactgggatctcttgtcagagacactgtcttmgtcagtttgggetgctataacaaagttccacagattaggtgg 
. cttgttaacagcagaaatatatttctcacagttctggaggctagaagtccaagctcaggatgccagcatggttgggctctggagggctcctaa 
acaccattattcttcattcacgcttctcagagccctaaggaagagagtgattcctcagctcraattgtgaactgctcctgccactctgtacttc 
etc gt gtaaaga a rccagactt t aca tcatgggt gacc a ctcccgc agag ttg tacagaacctccct t ggggecacaggat ggctggat tctg 
tcccctcatatacaaggaggttattgggaDagcatttctccctagaacaagagtgtatatttcagaaagctatggatgacttcccatggtcat 
cagatcractaggcaggaatgctattctcctgatagatgtgtggaaagtattcaattcaattttgacccaaagttctaggcactggattaagaa 
atgccaaacccaaaacgtttaactttagaattaaaaaaaaaaaa 



APPENDIX 2 BLASTn QUERY RESULT of SEQID7

> ref|NT_010194.16|Hs15_10351 Homo sapiens chromosome 15 genomic contig, reference assembly
Length=53619965

Features flanking this part of subject sequence:

45612 bp at 5' side: hypothetical protein LOC58489
70405 bp at 3' side: KIAA1199

Score = 6408 bits (3470), Expect = 0.0
Identities = 346&/3497 (99%), Gaps = 3/3497 (0%)
Strand=Plus/Plus

[Pairwise alignment rows of Query (SEQID7) against Sbjct (NT_010194.16, subject coordinates beginning at 51882643) are not legibly reproduced.]



■ . n i h i 1 1 i 1 1 1 1 1 1 1 n » 1 1 1 1 M i n i i n n i i n 1 1 u n 1 1 1 1 1 n 1 1 i n 1 1 n 

Sb}ct SI 685X00 ACCCACW^TCAGC^TGC^^ 51885159 
Query 2693 GATGtTGACAGGCGCATG^^CCAGCOT^CTCTWTOGATGOTC^CAtCGCTTTTAM 2752 

II II 1 1 1 £ H I U 1 1 1 i i i Ul E 1 1 i 1 1 1 3 M i 1 M 1 1I U U I Ul M 1 1 1 Ml ! 1 i M 

SbjCt &1885H0 GATGTTCACWGGGCATGACTCCftGCCCCAGCTCTAATCGATCCKTCATCCCTTTTftM 51B8521& 
Query 27 53 ATAATGACCftTGGGGC<JTGGGCTGGCGAGAGCAC?GACATC^CTTTCCTGCAATTCTGGG 2812 

n 1 1 1 i 1 II H U 11 1 1 n II 1 1 1 1 IUf I If i I i 1 M m f 1 1 1 1 i 1 1 1 E M 1 1 1 11 1 L 

SbjCt 51885220 ATAATGACGATGGGG CGTGGCCT^3GCG AGAC CAGTGACATCACTTTCCrGCAATTCTGCG 51365279 
Query 2813 TCMTTCCTOCTiaCtfWCTCTQTKWPT*^ 2S72 

n m n M U I n (I ! t H 1 J 3 1 1 1 1 1 1 M 1 1 i U M I f M I IE I ! E P t i I i J 1 1 ! 1 1 1 1 

$bjct 51885280 TCAGTTCCTGCTGCTmCTCTGTAVGTTt'GAATGACTGAAATAAATCTATTGGTTCCA? 5188533? 
Query 287 3 ATATTTCCTGCAAC ACTTCTG AC ATGTTC AC ATGC C TATC'T'tGGAATG-T GGTC AGG AG AG 2932 

U 1 1 1 II I i H U U 1 1 n U E I i U M 1 1 1 1 1 II II t M 1 U 1 1 1 i 1 i U Tl I M 1 1 1 1 

Sfajct ATATXTOCTGGAACAC TTCTGACATGTTCAC ATGCCTATCTTGG AATGTGGTCAGGAG AG 5158539* 

Query 2933 C AATUCCTTf GO ACT T AGAGGTCCTGGGTTC AAC ATTCTG C TA C T A G C TTCCTG'? A?G AA 2992 

IIEIIIlMlllltlllllllJUIIIIJIIIIMIIIItllllllllllilHItntl 

. Sbjct 51885100 CAATGGCr7TGGACTTAGAGGTCCrGGC?TCAAGATTCTGCTACTAGCrKX:TGTATGAA 51885*59 
Query : 2993 CTTGGACTAGC AACTTAJ^CTTCTCC AGGCCTG^GTT^ITCTC A^WGTAC AATG ATGGGAG 3052 

. iiiiifiiiiiniiinnittiiiiiiiMiiiuEiiiiiiiiiiiiiniMnii ■ • 

Query 30 5 3 G^ACCCTTGGTTTTGTAJ^GGAAtGGTGAGGACGAACTGGGATCTC^GTCAGA<JACA 3112. 

i ! 1 1 1 11 1 1 1 M U 1 1 1 M 1 1 1 1 1 1 E I E 1 1 1 1 1 U I M 1 1 1 1 1 1 1 1 1 1 f 1 1 U 1 1 1 1 1 1 1 ■ 

' Sbjct 51885520 GAATACCC??GCTCrT<tf^^ 
Query 3113 . CTGWTHGTCAGTtTCGS^^ 3171 

\ unfit 1 1 1 1 s 1 1 1 1 ii 1 1 e 1 1 1 1 ( 1 1 1 1 1 ii 1 1 1 m 1 1 1 1 1 1 1 ii 1 1 1 1 1 i 1 1 ii 1 1 . -.. ; 

Sbjct &19B558Q CTCTCTTAGTCAGrrT&CGCTGCTATAACAAAGra^ 51885635 
Quety '3173 GCAGAKATATArn^^ 

1 1 i i ] I L 1 1 M 1 1 1 1 J 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E i M M E 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 v : 

,Sbj>t &1885640 GC AG AAATAT ATTTCTC AC AGTTCTGG AGGCT AG AAG^TCCAAGC TCAGG A'tGCC AGCATG 51885699 
Query 3233 GTTGGGCTCTGGAGGGCWCTAAACACCATT'ATWTCATTCACGCTTCTCAGAGCCC'rA 3292 

1H t L n n 1 1 M M E M I M I M UU M I 111! M 1 1 EH 1 n III 11 lltt 1 M I ! E 

Sbict 51885700 GTI^CGCTCtGGAGGGCTCCTAAACACCATTATTCTTCATTCACGCT^CTCAGAGCCCTA 51885759 
Query 3293 &G<*AAGAGAGTGAT1^C7CAGCTCAATTGTGAAC 3352 

iMiinnnuEiiiiMMniiiiifiiiiiiiiiUMiMniiMinniii 

Sbjct 51865760 AGGAAGACAGItSACTCCTCAGCTCAAT^ 

Query 3353 TGTAAAGAARCCAGACTTTACATCATGGGTQACCACtCCCGCAGAGTTGTACAGAACCTC 34X2 

I ! ] I M 1 1 1 1 1 1 i i 1 1 1 f 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 i I r I f 1 1 i 1 1 1 1 1 i I i 1 1 1 1 1 1 1 ! I. . 

. Sbjct 51865820 TGTAAAGAAACCAGACTTTACATCATGGGTGACCAC'fCCCGCAGAGTTGTACAGAACCTC .$1885*79 
Query 3413 CCTTGGGGCC ACAG&ATGGCTGG AT'ECTGTCCCC tC AT ATACAAGC AGG'CTAlTGGGAC A 3472 

1 1 1 1 ! 1 II 11 II 1 If 1 1 1 1 H M H 1 1I I E E I E EH 1H I E II 1M II M IHE H M 11 

Sbjct 51885880 cCT^GGGCCACAGCATCGCrrGGATTC^^ 5)885939 
Query 3473 gCAT?TCTCCCTACAACAAGAGTGWAT1^ 3532 

iininniiiiiiiiiiiiniiiMniiiiiniiiiniMniuiiiiuiii 

Sbjct 51885940 CCATTfC ^CCTAGAAC AAG AGTOTAtAW M AGAAAGCTAtGC ATGACTTCC C KtCCTC 51885999 
Query 3533 AtCAGATCACTKGOlAGGAATC-CtAnCTCCTGATAGAT^GTGCAAAGTATTCAATTCA 3592 

I M II ! I M IS 1 1 L II E 1 1 1 II 1 1 i 1 1 1 1 f 1 1 1 1 U II 111 1 1 1 1 !1 1 E I II 1 1U I U I 

SbJCt 51886000 ATCAGATCACrAGGCACGAAltSCTATTCTCCTGATAGATGI^TGGAAAGf ATTCAATTCA 5*888059 
• Query 3593 AW TGACCCAMGTTCTAGG^^ 

1 1 1 1 n i h i n u 1 1 1 1 1 m i i e i e 1 1 1 m u u n 1 1 1 n n 1 1 1 1 1 1 i m n m i u 

Sbjet 51880060 ATTrTCACCCAAAGTTCrAGGCACrCGATTAACA^ 518&6U9 

Query 3653 tagaattaaaaaaaaaa 3669 

iimiiiiinmii 

SbJCt 51886120 TAGAATtAAAAAAAAAA 51886136 

Features flanking this part of subject sequence:

7g5Q bp ax..y.*l<to* ^vi3Qthobical,.or e^h„LQCi^M9 
'. 402H^&_??t 3/. side*., KlfefeDM . • 



Score = 296 bits (160), Expect = 1e-76
Identities = 169/172 (97%), Gaps = 1/175 (0%)
Strand = Plus/Plus

Query 5 CGCTCTCAACCCAAACCTCCCCACGCGC 64 

e 1 1 1 1 u ii 1 1 j m 1 1 s i ii M 1 1 iu i »f!iim 1 1 m ii i liiHiiiuinni . 

8b jet 51 91557* CGCTCTCAACCGAAACCTCCCCAGGGCCTACAGTC^ 51915*33 
QMSXy 65 CfcTGTTGGCTGTGTTCCAATGGrG^GTCCACAGAGAGAGAGAGACACCC-AGTGGAAG 123 

1 1 [ i i I f 1 1 1 i ! 1 1 1 1 1 1 J 1 1 1 1 1 ! J i I i I i 1 1 1 1 f 1 1 li i 1 1 1 1 M L 1 1 1 11111!!! 

Sbjct 51915634 CATGTTGGCTGTGTTCCAATGGTGA^GTCCACAGAGRGAGAGfiGAGfiCCCCAGTGGAAG 5I9J5693 
Qusry 15* GCACATCftTTT^TCTAAACGACTCCTGGAAGlTAC ACT PC7GCTCC ATCCTGG l?fi . 

n j n t n i u n 11 1 m i n i n 1 1 h m f e i e n h n 1 1 r i n a mi ... 

Sbjct S191S694 GCACATCAtTTTTTCTAAACGACTCTTGGAAG^ 51S157 
The Use of a Colon Cancer Associated Nuclear Antigen
CCSA-2 for the Blood Based Detection of Colon Cancer
Rene Tolba,3 Lukas Heukamp,4 Andreas Hirner,2 and Robert H. Getzenberg1*

Department of Urology, Johns Hopkins University School of Medicine, Baltimore, Maryland

Abstract The early diagnosis of colorectal cancer (CRC) is central for effective treatment, as prognosis is directly
related to the stage of the disease. Development of tumor markers found in the blood from patients, which can detect CRC
at an early stage, should have a major impact on morbidity and mortality of this disease. The nuclear matrix is the structural
scaffolding of the nucleus and specific nuclear matrix proteins (NMPs) have been identified as a "fingerprint" for various
cancer types. Previous studies from our laboratory have identified four colon cancer associated NMPs termed colon
cancer-specific antigen (CCSA)-2 to (CCSA)-5. The objective of the present study was to analyze the expression of one of
these proteins, CCSA-2, in serum from various patient populations and to determine whether CCSA-2 antibodies could be
used in a clinically applicable serum-based immunoassay specifically to detect colon cancer. Using an indirect ELISA,
which detects CCSA-2, the protein was measured in the serum from 174 individuals, including healthy individuals,
patients with colon cancer, patients with diverticulosis, colon polyps, inflammatory bowel disease (IBD) as well as other
cancer types. With a predetermined cutoff absorbance of 0.6 OD we have successfully utilized this approach to develop
an immunoassay that detected colon cancer. The immunoassay showed a sensitivity of 88.8% (24/27) and an overall
specificity of 84.2% (106/127). This initial study showed the potential of CCSA-2 to serve as a highly specific blood based
marker for colon cancer. Although potentially promising, the results of this study must be confirmed in larger independent
validation studies. J. Cell. Biochem. 104:286-294, 2008. © 2007 Wiley-Liss, Inc.

Key words: nuclear matrix proteins; colorectal cancer; tumor markers 



Colorectal cancer (CRC) is one of the best characterized tumor types
with regard to the multistep genetic progression pathway that has
been elucidated. Despite our molecular understanding, it is the second
leading cause of cancer-related death in the United States and the third
most common cancer after lung and breast cancer worldwide [Parkin, 2001].

In 2007, more than 153,760 new cases will be diagnosed and more than
52,180 people will die from CRC in the USA [Jemal et al., 2007].
More than 50% of these deaths may have been prevented through the use
of screening tests and the resulting early detection of the disease
[Walsh and Terdiman, 2003]. The long natural history of CRC as it
evolves from adenomatous polyps in the majority of cases provides
opportunities for detection of early-stage cancer and for prevention of
cancer by removal of polyps. Despite the potential for screening of CRC,
only a minority of the population currently undergoes a screening
program (www.cancer.org).

The low rate of participation in CRC screening is critical to understand
and is due to a number of factors, including patient discomfort, costs,
and poor acceptability of current screening methods. Compliance with a
serum test would likely be better than with tests involving feces and
stool handling. An effective blood test with high specificity and
sensitivity would be an ideal method to detect CRC and could lead to
a reduction of the mortality and morbidity of CRC.

In order to identify highly specific tumor markers, investigators have
focused attention on the structural changes that are associated with
neoplastic transformation. Alterations in the cellular and nuclear
structure are hallmarks of the carcinogenic process. These alterations
are so prevalent in cancer cells that they are commonly used as a
pathological marker of transformation. Nuclear shape reflects the
internal nuclear structure and processes and is determined by the
nuclear matrix [Pienta et al., 1989].

Most of the nuclear matrix proteins (NMPs) identified to date are common
to all cell types, but several identified NMPs are tissue and cell line
specific [Getzenberg, 1994]. This structure has many important functions,
such as DNA organization, stabilization and organization of gene
regulatory complexes, and synthesis of RNA, many of which have
implications in cancer progression [Konety and Getzenberg, 1999].

Cell type-specific "fingerprinting" of aberrant NMPs and their
appearance in cancer development has led to the analysis of NMP
composition of a variety of tumors in an effort to determine whether
these proteins can be developed as diagnostic and/or prognostic markers
for cancer. Previously, we have identified specific NMPs in prostate,
bladder, renal, colon cancer, and colon cancer metastasis to the liver
[Konety et al., 1998; Brunagel et al., 2004, 2002a,b; Van Le et al.,
2004; Myers-Irvin et al., 2005; Paul et al., 2005]. This oncological
"fingerprint" can be used as a specific and reliable diagnostic test,
even when a distinction may not be made accurately on a histological
basis alone [Getzenberg et al., 1991; Dhir et al., 2004a].

Our laboratory has recently demonstrated that an antibody raised against
the prostate cancer associated marker EPCA-2 is a sensitive and specific
serum test for prostate cancer [Dhir et al., 2004b; Paul et al., 2005;
Leman et al., 2007]. Additionally, an enzyme linked immunosorbent assay
(ELISA) has been developed to detect a specific nuclear protein, BLCA-4,
in the urine of individuals with bladder cancer. The test has been shown
to have a 96.4% sensitivity and 100% specificity [Konety et al., 2000].

Our previous studies describe the isolation of four NMPs (CCSA-2 to
CCSA-5) that are specifically expressed in colon cancer [Brunagel
et al., 2002b]. One of these proteins, CCSA-2, was isolated by excising
gel spots from negatively-stained two-dimensional gels. The gel spots
were then concentrated to obtain protein sequences and synthesized for
antibody production.

Internal peptide sequencing of CCSA-2 resulted in four distinct peptides
with sufficient amino acid sequence data. The four peptides along with
the most significant matches obtained from BLAST analysis are described
previously [Brunagel et al., 2002b]. Overall, while these data suggest
that some regions of CCSA-2 may be common to other proteins, there is
a high possibility of it being a novel uncharacterized protein.

The development of antibodies identifying aberrant NMPs in CRC could
yield a clinically important assay with great specificity. The objective
of this study was to investigate whether the NMP CCSA-2 can function as
a highly specific and sensitive serum based biomarker for CRC.

Using an indirect ELISA approach, sera from 
patients with colon cancer were compared with 
serum samples from healthy donors, patients 
with diverticulosis/diverticulitis, patients with 
inflammatory bowel disease (IBD), patients 
with colon polyps, patients after curative treat- 
ment of colon cancer and patients with different 
cancers. 

MATERIALS AND METHODS 

Protein Sequencing 

CCSA-2 was isolated according to an adaptation of a technique developed
by Gevaert [1995]. Two-dimensional gels were negatively stained by
0.2 M imidazole and 0.3 M zinc chloride. The staining was stopped, and
the protein gel spots were excised and frozen at -80°C. The spots were
then stained with Coomassie blue and concentrated on an
acrylamide/agarose gel and sequenced (Michigan State University).

Antibody Production

A standard protocol was followed in the production of monospecific
antibodies raised against the CCSA-2 peptides in rabbits. Peptide
sequences were chosen based upon the length of the sequence obtained as
well as antigenicity. The peptide sequences were modified to contain
a terminal cysteine for coupling purposes and conjugated to keyhole
limpet hemocyanin or bovine serum albumin to increase immunoreactivity.
Antibodies were produced at Biogenes Berlin (Germany) under an
Institutional Animal Care and Use Committee approved protocol.

Patients 

Serum samples were obtained from consenting patients under an
Institutional Review Board-approved protocol. Serum samples from
174 patients were analyzed. After obtaining a blood sample, patients
underwent a colonoscopy. Blood was collected with the blood collection
system S-Monovette (Sarstedt, Nümbrecht, Germany). After collection,
samples were centrifuged at 4,000 rpm. The supernatant was aliquoted in
2 ml tubes (Greiner Bio-one, Solingen, Germany). The samples were stored
at -80°C according to GLP (Good Laboratory Practice) conditions.

Of the patients studied, 27 were diagnosed with colon cancer. The
control group consisted of 40 patients with a normal colon as evident by
colonoscopy, 21 patients with diverticulosis, 20 patients with colon
polyps, 11 patients with IBD, and 37 patients with different cancer
types. Additionally, nine patients 2-9 years after curative surgery for
colon cancer were analyzed. The patients' characteristics are
summarized in Table I.

Indirect ELISA

The detectability of CCSA-2 using the anti CCSA-2-antibody was assessed
using serial dilutions of BSA-conjugated anti CCSA-2 antiserum against
known concentrations of CCSA-2 peptide coated into a 96-well plate.

Using Nunc Immunoplate Maxisorb plates prepared with 50 µl coating
solution (KPL, Baltimore, MD), 50 µl of serum per well, in triplicate,
was allowed to incubate at room temperature with moderate shaking
overnight. As a positive control, 50 µl of unlabeled rabbit
immunoglobulin G (IgG), diluted with 50 µl coating solution (KPL), was
plated overnight as well. The following day, all wells except the blank
wells were blocked with 250 µl of Super Block Blocking Buffer (TBS;
Pierce, Rockford, IL) for 45 min at 37°C. After blocking, all wells
were washed 3× with 250 µl reagent quality water before the addition of
the primary antibody. The primary antibody for the sample wells
consisted of 100 µl of diluted polyclonal antibody (previously
described) in Super Block Blocking Buffer (Pierce). The negative control
wells contained rabbit preimmune serum. Following a 2-h incubation
period at 37°C with moderate shaking, the plate was emptied, washed with
reagent quality water (250 µl, 3×), and then secondary antibody was
added to all the wells for another 2 h. The secondary antibody applied
was 1 mg/ml goat anti-rabbit IgG-horseradish peroxidase (human serum
adsorbed) (KPL), diluted 1:5,000 in Super Block Blocking Buffer
(Pierce). After washing the wells with reagent quality water
(3 × 250 µl), 100 µl of 3,3',5,5'-tetramethylbenzidine (KPL) was added
to each well and allowed to react for 14 min, and the absorbance was
read at 650 nm on a Safire (Tecan, Germany) microplate reader.
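The secondary antibody dilution in the protocol above can be checked with a short calculation. This is a minimal sketch, assuming that "1:5,000" means one part stock in 5,000 parts final volume; the function name is illustrative and not part of the published method.

```python
def working_concentration_ug_per_ml(stock_mg_per_ml: float, dilution_factor: float) -> float:
    """Convert a stock concentration (mg/ml) to the working concentration (µg/ml).

    Assumes a 1:dilution_factor dilution means 1 part stock in
    dilution_factor parts of final volume.
    """
    return stock_mg_per_ml * 1000.0 / dilution_factor


# Secondary antibody as described: 1 mg/ml stock, diluted 1:5,000.
secondary_ug_per_ml = working_concentration_ug_per_ml(1.0, 5000)
print(f"Working concentration: {secondary_ug_per_ml:.2f} µg/ml")
```

Under that assumption the working concentration of the secondary antibody is 0.2 µg/ml.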

Statistical Analysis 

The data were compiled as mean ± standard 
error of the mean. 
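The relationship between the standard deviation, the standard error of the mean, and the 95% confidence interval reported in Table IIb can be sketched as follows. This is an illustration only: the use of a Student's t critical value for the interval is an assumption (the text does not state how the software derived it), and the critical value for 26 degrees of freedom is hard-coded rather than computed.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)


def confidence_interval(mean: float, sem: float, t_crit: float) -> tuple:
    """Two-sided confidence interval: mean plus or minus t_crit * SEM."""
    half_width = t_crit * sem
    return mean - half_width, mean + half_width


# Colon cancer group values as listed in Table IIb.
mean_cc, sd_cc, n_cc = 0.7359, 0.1576, 27
sem_cc = standard_error(sd_cc, n_cc)

# Assumed: two-sided 95% critical value of Student's t for N - 1 = 26 degrees of freedom.
T_CRIT_DF26 = 2.0555
ci_low, ci_high = confidence_interval(mean_cc, sem_cc, T_CRIT_DF26)

print(f"SEM = {sem_cc:.5f}, 95% CI = {ci_low:.4f} to {ci_high:.4f}")
```

With these inputs the sketch gives values close to the tabulated standard error (0.03034) and confidence interval (0.6735 to 0.7983) for the colon cancer group.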

The normal distribution of the samples of each group was checked by the
Kolmogorov-Smirnov test. To analyze differences between the groups,
one-way analysis of variance (ANOVA) with Dunnett's post hoc test was
performed. The colon cancer group was taken as reference; statistical
significance was assumed at P < 0.05.

All statistical analyses and receiver-operator characteristic (ROC)
curve analyses were performed using GraphPad Prism version 4.03 for
Windows XP, GraphPad Software (San Diego, CA, www.graphpad.com).

RESULTS 

Using anti-CC2 antibodies, an indirect ELISA was developed to measure
the level of CCSA-2 in the serum from various patient populations. The
average value for CCSA-2 in the serum of the 27 colon cancer patients
was 0.73 ± 0.15 OD, whereas the average value for healthy individuals
(control) was 0.53 ± 0.06 OD. Statistical analysis demonstrated a highly
significant difference in serum CCSA-2 levels between the colon cancer
patients and each of the other patient groups (Tables IIa and IIb).

TABLE IIa. Dunnett Multiple Comparison Test
(Colon cancer group taken as reference)

Population pairs                                        Significance (P)
Colon cancer vs. control                                P<0.01
Colon cancer vs. diverticulosis/itis                    P<0.01
Colon cancer vs. IBD                                    P<0.01
Colon cancer vs. colon polyps                           P<0.01
Colon cancer vs. various cancer types                   P<0.01
Colon cancer vs. healthy patients after colon cancer    P<0.01
Colon cancer vs. inflammatory disease                   P<0.01

The receiver operating characteristic curves for CCSA-2 are shown in
Figure 1A,B. The CCSA-2 assay was highly accurate in separating colon
cancer from healthy control (area under the curve 0.94; 95% confidence
interval, CI, 0.89-0.99; Table III). Additionally, the ROC curve was
highly accurate in separating colon cancer from healthy control and all
other patient populations (area under the curve 0.8938; 95% CI,
0.83-0.94). Using the ROC curve, the cutoff level of 0.6 OD was selected
(Fig. 2).
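To make the area under the ROC curve concrete, the following minimal sketch computes it as the probability that a randomly chosen cancer sample has a higher absorbance than a randomly chosen control (ties counted as one half), which is a standard equivalent of the area under the empirical ROC curve. The OD readings are hypothetical values invented for illustration; they are not the study data, and this is not the procedure implemented in the software used by the authors.

```python
def roc_auc(positives, negatives):
    """Area under the empirical ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties contribute one half.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical absorbance readings (OD), for illustration only.
cancer_od = [0.55, 0.62, 0.69, 0.74, 0.81, 1.25]
control_od = [0.40, 0.48, 0.53, 0.56, 0.61, 0.72]

auc = roc_auc(cancer_od, control_od)
print(f"AUC = {auc:.4f}")
```

An AUC of 1.0 would mean every cancer sample reads higher than every control; 0.5 corresponds to no discrimination.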

Using a cutoff value of 0.6 OD, the sensitivity was 88.8%: 24 of 27
colon cancer patients were detected in serum with CCSA-2. The
specificity was 92.5%: 37 of 40 healthy individuals were correctly
identified by the assay as negative. The overall specificity was 84.2%:
106 of the 127 individuals diagnosed as normal were below the cutoff
(Table IV; Fig. 3).
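The sensitivity and specificity figures follow directly from the counts of samples above and below the cutoff. The sketch below shows the arithmetic for the colon cancer and healthy control groups; the helper names are illustrative, and treating a reading of exactly 0.6 OD as negative is an assumption, since the text only distinguishes values above and below the cutoff.

```python
CUTOFF_OD = 0.6


def is_positive(od: float, cutoff: float = CUTOFF_OD) -> bool:
    """Call a sample positive when its absorbance exceeds the cutoff.

    Assumption: a reading exactly at the cutoff is treated as negative.
    """
    return od > cutoff


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased individuals that test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased individuals that test negative."""
    return true_neg / (true_neg + false_pos)


# Counts reported in the text: 24 of 27 colon cancer patients above the
# cutoff, 37 of 40 healthy individuals below it.
sens_colon_cancer = sensitivity(true_pos=24, false_neg=3)
spec_healthy = specificity(true_neg=37, false_pos=3)

print(f"Sensitivity: {sens_colon_cancer:.4f}")
print(f"Specificity (healthy controls): {spec_healthy:.4f}")
```

Note that 24/27 evaluates to about 0.889, which the text reports as 88.8%, and 37/40 is exactly 0.925.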

DISCUSSION 

The early diagnosis of CRC and the early detection of recurrence are
central to the effective treatment of this disease. There is a consensus
that CRC screening is effective and that CRC can be prevented in many
cases. The incidence of CRC has dropped in recent years, possibly due to
screening programs [Mandel, 2005]. There is less consensus regarding
optimal screening strategies, as sensitivity, specificity, and patient
acceptance limit current options. To overcome these barriers, a range of
approaches, including proteomics-based testing, stool genetic testing,
radiological imaging, and enhanced endoscopy, has been the focus of
intense research. Presently, colonoscopy, with a sensitivity of 97% and
a specificity of 98% and a sensitivity for adenomas of at least 1 cm
diameter of around 90% [Pickhardt et al., 2003; Winawer et al., 2003],
is considered the gold standard for colon cancer diagnosis and offers
the potential both to find and to remove premalignant lesions, but it is
associated with high cost, patient discomfort, complications, and
variable sensitivity depending on the experience of the endoscopist.

A useful diagnostic assay must be sensitive
and must detect the cancer in an early tumor


TABLE IIb. Summary of Data

Group                   N     Mean      Standard     Standard error   Median
                                        deviation    of mean
Colon cancer            27    0.7359    0.1576       0.03034          0.6900
Control                 40    0.5324    0.06086      0.009624         0.5300
IBD                     11    0.5390    0.06725      0.02028          0.5660
Colon polyps            20    0.5686    0.1329       0.02971          0.5400
Diverticulosis/itis     21    0.5529    0.09961      0.02174          0.5500
Other cancer type       37    0.5687    0.1445       0.02375          0.5450
After colon cancer      9     0.4882    0.05895      0.01965          0.4860
Inflammatory disease    9     0.5573    0.09594      0.03198          0.5340

Group                   Minimum   Maximum   95% Confidence interval
                                            From      To
Colon cancer            0.5480    1.250     0.6735    0.7983
Control                 0.3990    0.7200    0.5129    0.5518
IBD                     0.4390    0.6470    0.4938    0.5841
Colon polyps            0.3400    0.8480    0.5064    0.6308
Diverticulosis/itis     0.4000    0.7700    0.5075    0.5982
Other cancer type       0.3688    1.031     0.5204    0.6169
After colon cancer      0.3920    0.5690    0.4429    0.5335
Inflammatory disease    0.4390    0.7300    0.4836    0.6311

Fig. 1. A: Receiver-operator characteristic (ROC) curve for CCSA-2 in separating normal healthy patients
and colon cancer patients. AUC: area under the ROC curve. B: Receiver-operator characteristic (ROC) curve
for CCSA-2 in separating colon cancer patients from all other patients including healthy controls.



stage. It must also have a high specificity, to minimize false positives
that necessitate costly or invasive examination and additionally scare
patients and their families needlessly [Ahlquist, 1997]. It is almost
impossible for one biomarker to meet all these criteria, but the
combination of specific


TABLE III. Area Under the ROC Curve

1. ROC analyses for CCSA-2 in separating control individuals
   (normal colon) from colon cancer patients

   Area                              0.9394
   Std. error                        0.02751
   95% Confidence interval           0.8854-0.9933
   P-value                           <0.0001
   Data
     Control                         40
     Colon cancer patient            27

2. ROC analyses for CCSA-2 in separating control individuals
   (normal colon) and all other patients from colon cancer patients

   Area                              0.8938
   Std. error                        0.02831
   95% Confidence interval           0.8384-0.9493
   P-value                           <0.0001
   Data
     Control and all other patients  127
     Colon cancer patient            27



markers could have the possibility to meet the condition for a useful
screening test in CRC. This study shows that the ELISA that detects
serum based CCSA-2 is both sensitive and specific for colon cancer. In
addition, this is the first time that CCSA-2 has been detected in the
serum from patients with advanced adenomas, confirming tissue data we
found in previous studies of colon polyps [Brunagel et al., 2004]. The
serum based ELISA with CCSA-2 antibody demonstrated a sensitivity of
88.8% and, considering the entire study population, a specificity of
84.2%.

Three of the colon cancer patients (two with tumor stage UICC II and one
with UICC III) were below the cutoff point and were therefore considered
to be negative for CCSA-2. So far, we have no explanation why these
patients do not appear to express CCSA-2 in the serum. Previous studies
have shown that CCSA-2 is expressed in 80% of colon cancer tissues (IS).
Presuming that not all colon cancers express the NMP CCSA-2, we
understand this limitation; the development of additional serum markers
based on the other identified NMPs (CCSA-3, -4, and -5) could close this
gap [Leman et al., 2007].

Fig. 2. Serum analysis of CCSA-2 in colon cancer and control.
Using the ROC curve, a cutoff of 0.6 OD, represented by the red line,
results in the optimal balance between sensitivity and specificity.

Based on evidence from epidemiological and pathological studies, most
sporadic colon cancers are thought to develop from benign adenomas.
Presently, there is no clear way of identifying which adenomas will
become malignant. There is consensus that progression is associated with
severe dysplasia, patient age, size of adenoma, and histological type
[O'Brien et al., 1990]. Adenomas that are >1 cm or show severe dysplasia
and/or villous architecture are described as advanced.

In previous studies, we demonstrated the expression of CCSA-2 in
advanced polyps [Brunagel et al., 2004]. Four serum levels from patients
with colon polyps are above the cutoff point, three of which have
advanced adenomas.

Three normal individuals showed an increased level of CCSA-2 in their
serum. According to the colonoscopy reports, the examination was not
difficult and the colon was clean. According to the literature, 4% of
polyps or carcinomas are missed during colonoscopy, especially in the
right colon [Bressler et al., 2004]. In these cases, and additionally in
the cases with diverticulosis and IBD, we can only speculate whether
something was missed. However, especially in cases where colonoscopy is
difficult, a serum marker that could detect early colon cancer and
furthermore advanced adenomas would be very helpful.

Regarding the other cancer types, 9 patients out of 37 had an expression
of CCSA-2 in the serum above the cutoff point: three patients with
cholangiocarcinoma (3/6), one patient with lung cancer (1/4), four
patients with gastric cancer (4/13), and one patient with hepatocellular
carcinoma (1/3). None of the 11 patients with pancreatic cancer had an
expression of CCSA-2 above the cutoff point. There is no correlation
between the tumor stage and the CCSA-2 expression (correlation
coefficient (r) = -0.1687, r squared = 0.02847).
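The reported r squared is simply the square of the correlation coefficient, which can be verified in a few lines. The sketch below also includes a plain Pearson correlation function for reference; the assumption that the reported r is a Pearson coefficient is ours (the text does not name the method), and no patient-level data are reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Correlation coefficient as reported in the text.
r_reported = -0.1687
r_squared = r_reported ** 2
print(f"r squared = {r_squared:.5f}")
```

Squaring -0.1687 gives approximately 0.0285, in line with the reported r squared of 0.02847; the small difference is consistent with rounding of r. An r squared this small means tumor stage accounts for under 3% of the variance in CCSA-2 readings.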

So far we have no explanation for the 
expression of CC2 in other cancer types. 

To evaluate the effect of the removal of the colon cancer by surgery on
the serum CCSA-2 value, samples were obtained from nine patients
2-9 years after curative colon cancer surgery. All nine individuals were
considered to be normal after curative colon cancer surgery.

Additionally, patients with benign inflammatory disease such as
pancreatitis, gastritis, and diverticulitis and patients with IBD were
studied. One patient with IBD (1/11), four patients with diverticulosis
(4/21), and two patients with benign inflammatory disease (2/9) had
CCSA-2 values above the cutoff point. We could not observe a correlation
between the elevated CCSA-2 levels and the grade of the inflammation.

Further studies are needed to examine the expression of CCSA-2 in other
diseases. Nevertheless, the overall specificity of CCSA-2 is 84.2%,
showing that it is a specific marker for colon cancer. This is the first
study demonstrating the ability of CCSA-2 antibodies to specifically
identify colon cancer patients in a clinically applicable test. However,
clinical trials need to be performed to evaluate the sensitivity and
specificity in independent validation studies in a larger population of
patients.

TABLE IV. Specificity/Sensitivity of Blood CCSA-2 Assay

                   No. of samples <0.6 OD/total no. samples    Specificity %
Donors             37/40                                       92.5
All populations    108/127                                     84.2

                   No. of samples >0.6 OD/total no. samples    Sensitivity %
Colon cancer       24/27                                       88.8

Fig. 3. Serum analysis of CCSA-2 in study populations. Total of 174 serum samples screened for CCSA-2 in
indirect ELISA. Cutoff value of 0.6 OD, represented by red line across graph. Line in between the patient
groups represents the median value of the group.
EXHIBITS



KIAA1199



Analysis of three sequences which correspond to regions of KIAA1199 confirms that
KIAA1199 is differentially expressed (higher) in adenoma tissue compared to normal tissues.
Expression patterns for the three KIAA1199 sequences are compared in Table 1 and shown in
Figures 1, 2 and 3. The results have been obtained by analysing 19 adenoma patients and 30
non-diseased controls.



Table 1: Comparison of expression patterns for KIAA1199 between adenoma tissues
and non-neoplastic tissues.

Figure 1: Sequence ID 7 (KIAA1199)

Figure 2: Sequence ID 103 (KIAA1199)

Figure 3: Sequence ID 316 (KIAA1199)
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Abstract 

Colorectal cancers are believed to arise predominantly
from adenomas. Although these precancerous lesions
have been subjected to extensive clinical, pathologic,
and molecular analyses, little is currently known about
the global gene expression changes accompanying their
formation. To characterize the molecular processes
underlying the transformation of normal colonic
epithelium, we compared the transcriptomes of 32
prospectively collected adenomas with those of
normal mucosa from the same individuals. Important
differences emerged not only between the expression
profiles of normal and adenomatous tissues but also
between those of small and large adenomas. A key
feature of the transformation process was the
remodeling of the Wnt pathway reflected in patent
overexpression and underexpression of 78 known
components of this signaling cascade. The expression
of 19 Wnt targets was closely correlated with clear
up-regulation of KIAA1199, whose function is currently
unknown. In normal mucosa, KIAA1199 expression
was confined to cells in the lower portion of intestinal
crypts, where Wnt signaling is physiologically active,
but it was markedly increased in all adenomas, where it
was expressed in most of the epithelial cells, and in
colon cancer cell lines, it was markedly reduced by
inactivation of the β-catenin/T-cell factor(s) transcription
complex, the pivotal mediator of Wnt signaling. Our



Received 6/9/07; revised 7/28/07; accepted 8/2/07.

Grant support: Zurich Cancer League and Swiss National Science Foundation.
The costs of publication of this article were defrayed in part by the payment of
page charges. This article must therefore be hereby marked advertisement in
accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Molecular Cancer
Research Online (http://mcr.aacrjournals.org/).
J. Sabates-Bellver and L.G. Van der Flier contributed equally to this work.
Requests for reprints: Giancarlo Marra, Institute of Molecular Cancer Research,
University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland.
Phone: 41-04**3Sh347£; Fax: 41-044*354484. E-mail: marra@imcr.u2rtch
Copyright © 2007 American Association for Cancer Research.
doi:10.1158/1541-7786.MCR-07-0267



transcriptomic profiles of normal colonic mucosa and
colorectal adenomas shed new light on the early stages
of colorectal tumorigenesis and identified KIAA1199 as a
novel target of the Wnt signaling pathway and a putative
marker of colorectal adenomatous transformation.
(Mol Cancer Res 2007;5(12):1263-75)

Introduction

In developed countries, sporadic adenomatous colorectal
polyps are found in roughly one third of asymptomatic adults
below the age of 50 who undergo colonoscopy. Depending on
their characteristics (multiplicity, size, histologic features, and
degree of dysplasia), these lesions can be associated with a
substantial risk of recurrence (up to 60% at 3 years) and the
development of advanced neoplastic disease (reviewed in ref. 1
and references therein). It has been estimated that 15% of all
adenomas measuring ≥1 cm will progress to carcinomas within
10 years of their detection (2).

Although adenomatous polyps are not the only precancerous
lesions in the colorectum, they are the most common, and they
are the precursors of most of the cancers in this organ. In these
neoplasms, the transformation process begins in the epithelial
crypts and seems to result from qualitative, quantitative, and
spatial subversion of the Wnt signaling pathway, the physiologic
regulator of epithelial homeostasis (3-5). This adenoma-carcinoma
pathway of tumorigenesis is characterized by
mutations involving various components of this pathway
(e.g., APC, whose germ-line mutations are responsible for
familial adenomatous polyposis; CTNNB1, which encodes a
subunit of the cadherin protein complex known as β-catenin;
and Axin, the gene encoding a multidomain scaffold protein
that is essential for β-catenin degradation). The result of these
mutations is an accumulation of β-catenin, first in the
cytoplasm and then in the nucleus, where it associates with
DNA-binding proteins of the T-cell factor (TCF)/lymphoid
enhancer factor family, transforming them from transcriptional
repressors into transcriptional activators that affect the expression
of numerous genes involved in epithelial homeostasis.

Although the key role played by adenomatous polyps in 
colorectal tumorigenesis is widely acknowledged, the gene 
expression changes that trigger or accompany their development 
have never been comprehensively studied. We therefore
conducted a transcriptomic analysis of prospectively collected
colorectal adenomas using a standardized oligonucleotide
microarray covering the entire human genome. This study not
only provided new information that is fundamental for future
molecular characterization of these precancerous lesions but
also allowed us to identify a putative marker of colorectal
tumorigenesis.



Results 

The focus of our study was the adenoma-adenocarcinoma
pathway of colorectal carcinogenesis, which is closely linked to
deregulation of the Wnt signaling pathway. To gain insight into
the early steps of this process, we confined our investigation
exclusively to sporadic, pedunculated colorectal adenomas
(type 0-Ip of the Paris classification; ref. 6). Nonpolypoid and
sessile polypoid lesions were not included because in some
cases their transformation is believed to proceed along
nonadenomatous pathways (7). Details on our case selection
criteria are provided in Materials and Methods.



Thirty-two pedunculated adenomatous polyps, each with
matched samples of normal mucosa, were prospectively
collected from 28 patients (Table 1). The total number of
synchronous and previously excised adenomas was <3 in 18 of
28 patients and 3 to 15 in the remaining 10. In this latter
subgroup, the absence of APC- or MYH-associated multiple
adenomatosis had been confirmed by genetic testing of
lymphocyte DNA. Histologic analysis of one polyp (case
NM) revealed superficial infiltration of the submucosa, but this
case was not excluded because the region sampled for
microarray analysis was clearly adenomatous. (As noted below,
this finding was consistent with the results of hierarchical
cluster analysis shown in Supplementary Fig. S1).

Analysis of microarray data for the 32 adenoma/normal
mucosa tissue pairs revealed that 31,033 of the probes were
expressed in one or both of the tissue groups. The normal
tissues were effectively segregated from the adenomas in four
unsupervised analyses of the expression levels of these genes
[hierarchical clustering, principal component analysis (PCA),
correlation analysis, and correspondence analysis (CA); see
Materials and Methods for details; Fig. 1]. In a separate



Table 1. Characteristics of the 28 Patients with Adenomatous Polyps Included in the Study

Abbreviations: M, male; F, female; A, ascending colon; T, transversum; D, descending colon; S, sigmoid colon; R, rectum; T, tubular; T-V, tubulovillous; L, low-grade
dysplasia; H, high-grade dysplasia.

*Low-grade versus high-grade dysplasia as defined by the WHO classification of tumors of the digestive system, editorial and consensus conference in Lyon, France,
November 6-9, 1999. IARC.
†This number includes the adenoma(s) subjected to microarray analysis.
‡Total number of adenomas detected and excised during previous colonoscopies.
§Two adenomas from these patients were analyzed.
∥These cases were considered as recurrent adenomas for the CCA.
¶The index colonoscopy was done in a different center about 10 y before the study colonoscopy.
**Hyperplastic polyposis.
††No previous colonoscopies.
‡‡Superficial submucosal invasion (T1). The tissue collected for microarray came from the adenomatous portion of the polyp.
analysis, these two tissue groups were also unequivocally
distinguished from a previously described set of 25 colon
cancers (8), which we reanalyzed for this study with the same
microarray used to characterize the adenomas and normal
mucosa (Supplementary Fig. S1).

Almost half of the expressed probes (15,059 of 31,033)
displayed significant expression changes in adenomas. Those
with fold changes ≥2 (1,190 probes up-regulated and 2,469
down-regulated in adenomas) were subjected to gene ontology
analysis to identify the biological processes involved in the
transition of normal mucosa to adenoma. The most significant
results of this analysis are listed in Supplementary Table S1.
The processes that were most markedly overrepresented among
genes that were up-regulated in adenomas included mitosis,
DNA replication, and spindle organization. Down-regulated
genes were predominantly involved in host immune defense,
inorganic anion transport, organ development, and inflammatory
response, although a small group of genes involved in the
latter process was up-regulated in adenomas (Supplementary
Fig. S2).
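
The fold-change filter mentioned in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a probe is called up- or down-regulated when the adenoma/normal intensity ratio is at least 2 in either direction, the probe names and intensities are hypothetical, and the significance testing used in the study is not reproduced. The probe counts are the ones quoted in the text.

```python
# Illustrative sketch only; not the authors' analysis pipeline.

def classify_probe(adenoma_intensity, normal_intensity, threshold=2.0):
    """Label a probe by its adenoma/normal fold change (assumed cutoff: 2-fold)."""
    ratio = adenoma_intensity / normal_intensity
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"

# Hypothetical probe intensities: (adenoma, normal mucosa).
probes = {
    "probe_a": (800.0, 100.0),  # 8-fold higher in adenoma
    "probe_b": (50.0, 200.0),   # 4-fold lower in adenoma
    "probe_c": (120.0, 100.0),  # 1.2-fold, below the cutoff
}
labels = {name: classify_probe(a, n) for name, (a, n) in probes.items()}

# Counts quoted in the text.
expressed_probes = 31033
changed_probes = 15059
up_2fold, down_2fold = 1190, 2469

share_changed = round(100 * changed_probes / expressed_probes, 1)
n_filtered = up_2fold + down_2fold

print(labels)
print(f"{share_changed}% of expressed probes changed; {n_filtered} passed the 2-fold filter")
```

The share of changed probes computed this way is a little under one half, in line with the "almost half" stated in the text.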

We then analyzed the transcript levels of 319 genes believed
to be components of the complex Wnt signaling pathway
(Supplementary Table S2). Sixty-six of these genes (21%) were
not expressed in either the normal or adenomatous tissue,
and 34% were expressed similarly in both tissue groups. The
remaining 144 genes displayed significantly altered expression
in adenomas, and 78 of 144 displayed fold changes of ≥2.
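
The counts in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the figures quoted in the text; the number of similarly expressed genes is not stated directly, so it is derived here as the remainder, which is an assumption about how the 34% was obtained.

```python
# Consistency check of the Wnt pathway gene counts quoted in the text.
total_genes = 319        # genes considered Wnt pathway components
not_expressed = 66       # not expressed in either tissue
altered = 144            # significantly altered in adenomas
altered_2fold = 78       # altered with fold changes of at least 2

# Derived (assumption): genes expressed similarly in both tissue groups.
similar = total_genes - not_expressed - altered

def percent_of_total(n):
    """Percentage of the 319 genes, rounded to the nearest integer."""
    return round(100 * n / total_genes)

print(similar)                             # derived count
print(percent_of_total(not_expressed))     # compare with the stated 21%
print(percent_of_total(similar))           # compare with the stated 34%
print(round(100 * altered_2fold / altered))  # share of altered genes at 2-fold or more
```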

A supervised extension of CA (9), canonical CA (CCA),
was then used to identify possible correlations between gene
expression patterns and clinical or pathologic variables. Four
of the variables considered (adenoma diameter, colon segment
of origin, degree of dysplasia, and adenoma recurrence; see
Table 1) were clearly associated with distinct clusters of
expression profiles (Fig. 2, variables in A and clusters for
adenoma diameter in B; more details in the legend to this
figure). The profile of adenomas measuring >20 mm could be
easily distinguished from those of smaller (≤20 mm) adenomas.
As shown by CCA and visualized on the corresponding CCA
score plot (Fig. 2B), the centers of the three adenoma size
clusters are distributed along the principal CCA axis (the
vertical axis in Fig. 2B, the most important axis of separation of
the expression profiles) in a definite order, with increasing
diameters corresponding to progressively higher CCA scores.
The variable large adenoma diameter was closely correlated
with the vertical CCA axis (i.e., its vector "d>20mm" in
Fig. 2A is almost parallel to this axis). It is interesting to note
that the same correlation can be observed for the variable high-degree
dysplasia (i.e., represented in Fig. 2A by vector "Hd").
This finding confirms the expected correlation between larger
diameters and higher-degree dysplasia.

The CCA plot of the 11,70$ modeled probes (loading plot, 
not shown) suggested that the distinction between the three size 
groups of adenomas is due to a complex network of relatively 
small changes in the expression of numerous genes (as opposed
to marked changes involving a limited number of genes). 
Nevertheless, to maximize the use of the extensive data sets, we 
selected the 500 probes with the highest loading scores along 
the CCA axis 1 and isolated a set of genes whose expression 
changes displayed significant positive or negative correlation 



with adenoma size (Supplementary Table S3). Although their
association with adenomas must be validated in a larger series,
these are the expression changes most likely to play causal roles
in the progression of these tumors.

It should be mentioned that normal mucosa from the sigmoid
colon had an expression profile that differed significantly from
that of tissues from other colon segments (Fig. 2A). This
finding will be explored in a future study conducted on a large
series of normal mucosa samples from different colorectal
segments.

The transcriptional profile of the 32 adenomas was
thoroughly analyzed to identify genes likely to be involved
in the development and evolution of these lesions. One of the
first features that attracted our attention was the marked up-regulation
of KIAA1199 (Supplementary Table S4), a gene
encoding a protein with unknown function. Its overexpression
was striking in all colorectal adenomas we examined (average
increases of 54.8-fold compared with normal mucosa) and in
carcinomas (8). These findings were fully confirmed by real-time
reverse transcription-PCR analysis of RNA extracted from
samples used for the microarray study and from additional
samples collected after the present study was completed
(Supplementary Fig. S3).

In light of these findings, it was natural to wonder whether
KIAA1199 might be a novel positively regulated target of Wnt
signaling, which is characteristically deregulated in colorectal
tumors. Previous microarray studies indicated that genes
coregulated at the transcriptional level under different conditions
tend to be involved in the same processes and pathways,
and the analysis of transcriptional coexpression has been used
to predict the function of novel genes (10-12). Therefore, we
conducted a search for known Wnt targets (listed in
Supplementary Table S5) among the genes whose expression
patterns in all the tissue samples significantly correlated with
those of KIAA1199. (The procedure used in this analysis is
summarized in Materials and Methods and Supplementary
Fig. S4.) Forty-nine percent of the known Wnt targets that were
overexpressed in our adenoma samples had expression patterns
that were positively correlated with that of KIAA1199 (Fig. 3A
and B) as opposed to only 7.9% of the overexpressed genes that
are not considered Wnt targets (P < 0.0001).
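
The general idea of a coexpression search (finding genes whose expression across samples tracks that of a reference gene) can be sketched as follows. This is a hedged illustration, not the procedure of Materials and Methods or Supplementary Fig. S4: it assumes a plain Pearson correlation with an arbitrary cutoff of 0.9, and the gene names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical expression of the reference gene across five samples.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical candidate genes measured in the same five samples.
candidates = {
    "gene_x": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with the reference
    "gene_y": [5.0, 4.1, 2.9, 2.2, 1.0],  # falls as the reference rises
    "gene_z": [3.0, 1.0, 3.0, 1.0, 3.0],  # unrelated to the reference
}

CUTOFF = 0.9  # assumed threshold for "positively correlated"
correlations = {name: pearson(reference, values) for name, values in candidates.items()}
selected = [name for name, r in correlations.items() if r >= CUTOFF]
print(selected)
```

In a real analysis the selected genes would then be compared against a list of known pathway targets, as the paragraph describes for the Wnt targets of Supplementary Table S5.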

Evidence of the potential involvement of KIAA1199 in the
Wnt signaling pathway had also emerged from another study by
our group (13). A combined analysis of microarray data of
tissues and cell lines placed KIAA1199 at the top of a list of
genes [Supplementary Table S1 of ref. 13] that were up-regulated
in colorectal adenomas and down-regulated in colon
cancer cell lines that had undergone stable transfection with
doxycycline-inducible forms of dominant-negative TCF1 or
TCF4 to suppress Wnt signaling (14, 15). In the present study,
KIAA1199 was also found to be markedly down-regulated in
LS174T colon cancer cells in which Wnt signaling had been
blocked by the induction of β-catenin small interfering RNA or
NH2-terminal-deleted TCF4 (15, 16). The dramatic decrease in
KIAA1199 mRNA levels associated with this inhibition of the
Wnt pathway was confirmed by Northern blotting (Fig. 3C).

In general, Wnt target genes are expressed predominantly in 
the proliferating compartment of normal intestinal crypts (lower
portion), and their expression is appreciably increased in 
FIGURE 2. Clinical/pathologic variables that correlate with distinct gene expression profiles. The panels summarize the most important results of the CCA
of mRNA intensity log-ratio values (adenoma:normal) of expressed genes. For clarity, CCA axis 1 has been drawn vertically in both panels. A. Correlation
between specific clinical/pathologic variables (adenoma diameter, colon segment of origin, degree of dysplasia, and adenoma recurrence) and clusters of
differential gene expression profiles (coded as log-ratio profiles), such as those shown in B. Each vector represents a specific value for a given variable (e.g.,
adenoma diameter of >20 mm and high-degree dysplasia) and points toward the center of the profile cluster correlated with the clinical/pathologic
characteristic it represents. If the centers for each specific value are separated, the corresponding vectors point in distinct directions; otherwise, they are
directed toward the same point. In the former case, the represented variable can be assumed to be significantly correlated with the profiles; in the latter case,
there is no correlation. The length of the vector reflects the strength of the correlation: those approaching the circumference of the correlation circle, which
represents a correlation value of 1, indicate stronger correlation than shorter vectors (correlation closer to 0). d, diameter; Hd, high-degree dysplasia; Ld, low-degree
dysplasia; A, ascending colon; T, transverse colon; D, descending colon; S, sigmoid colon; R, rectum; Rec, recurrent adenomas; no Rec, no recurrent
adenomas. Unlabeled vectors are related to variables that were not clearly associated with any distinct cluster of expression profiles. Larger adenomas were
predictably associated with high-degree dysplasia. In contrast, their association with nonrecurrence was unexpected and probably due to the fact that
patients who had already undergone endoscopic polypectomy (i.e., those with recurrence) presented relatively recent-onset (consequently, smaller) polyps at
the study colonoscopy. B. CCA score plot with samples grouped by adenoma diameter. Each of the three size-related groups is delimited by an ellipse with
the center labeled. The ellipse representing the adenomas measuring >20 mm in diameter shows very little overlap with those of the other two groups
(adenomas with diameters of 20 mm and those with diameters of <20 mm).

patterns were confirmed at the protein level by immunohistochemistry
done with an antibody raised in our laboratory
(Fig. 4D-J). This analysis also revealed that KIAA1199 is a
cytoplasmic protein whose expression is most intense near the
cell membrane, particularly on the luminal side of the dysplastic
cell multilayer (Fig. 4F-J).



FIGURE 1. Unsupervised analyses of microarray data. A. Hierarchical clustering analysis. The 64 tissue samples represented on the X axis include 32
normal mucosal samples (green branches) and 32 adenomas (red branches). Each probe plotted on the Y axis is color coded to indicate the level of
expression of the gene relative to its median expression level across the entire tissue sample set (blue, low; red, high). In the adenoma dendrogram,
branches representing individual samples and small groups merge at higher levels than those of the normal mucosa dendrogram, reflecting lower-level
correlation (i.e., higher variability among the adenoma specimens). B. PCA. Profile plot of the normalized first principal component (PCA1) across the 64
specimens (green dots, normal mucosa; red dots, adenomas). The two tissue groups differ significantly in terms of PCA1 (P < 0.0001), which accounted for
2&% of the total variance. Note the higher variability of the PCA1 values in the adenoma group (higher fluctuation). C. Correlation analysis. Tile plot
visualization of the pairwise correlations of the samples. Correlation values are indicated on the grayscale column (white > black; high > low). High correlation
is observed among the samples within each group (top right quadrant, adenomas; bottom left quadrant, normal mucosa), although the adenomas displayed
somewhat greater diversity (i.e., on the whole, the gray tones in the top right quadrant are darker than those in the bottom left quadrant). Top left and bottom
right quadrants, normal and adenoma samples are poorly correlated. However, samples from the same patient generally showed higher correlation than that
observed between normal and adenoma samples from different patients (bright pixels on the secondary diagonals in the top left and bottom right quadrants).
This finding probably reflects the strong influence of several factors, including the individual genetic background and lifestyle and the fact that the normal and
adenomatous tissues from a given patient were from the same colon segment. D. CA of mRNA log(intensity) values of expressed genes from 27 of the 32
tissue pairs (green dots, normal mucosa; red dots, adenoma). The other five pairs were excluded from this analysis because one of the two samples behaved
as an outlier. Limiting our analysis to the more homogeneous pairs facilitated the comparison of the gene expression profiles for the two tissue groups and
allowed more reliable identification of clinical/pathologic variables associated with profile scatter (see Fig. 2). The areas delimited by the ellipses represent
95% of the estimated binormal distribution of the sample scores on the first and second CA axes. The map of the sample scores on the first two axes shows
that CA efficiently discriminates between normal and adenoma samples. Higher variability is evident in the adenoma group, where the samples are more
widely dispersed.



adenomatous glands (15). Our analysis of human tissues with
preserved architecture indicated that these are also attributes
of KIAA1199. In in situ hybridization studies, KIAA1199
mRNA was detectable only in the lower portion of normal
colonic epithelial crypts (Fig. 4A and B), and its expression
levels were much higher in dysplastic glands (Fig. 4C). These
Discussion

Adenomatous colorectal polyps are one of the most common
human tumors and the most frequent precancerous lesions in
the colorectum, but their transcriptome has been only partially
analyzed, and the data are generally based on a limited number
of cases (17-20). We attempted to fill this gap by doing a
comprehensive whole-genome microarray analysis of a large,
highly homogeneous set of adenomas that was collected
prospectively.

A comparison of the transcriptomes of adenomatous polyps 
and segment-matched samples of normal colorectal mucosa 



revealed evidence of broad-scale remodeling. As a starting
point for future verification studies, we have drawn up a list of
478 genes that were significantly up-regulated (n = 153) or
down-regulated (n = 325) in the adenomatous tissues (fold
changes of ≥4; Supplementary Table S4). Space constraints
preclude more than a cursory examination of this list, but we
have highlighted in Table 2 certain aspects that we feel are
particularly interesting in terms of their relevance to the process
of adenoma formation. For instance, transcription regulation
seems to be extensively modified. Twenty-nine molecules
involved in this process were expressed in adenomas at levels
FIGURE 3. KIAA1199 is a putative target of Wnt signaling. A. Degree of correlation between the expression of KIAA1199 mRNA and that of 19 known Wnt
signaling target genes identified with the procedure described in Materials and Methods, Results, and Supplementary Fig. S4. For each of the 20 genes, the
graph shows the normalized intensity of expression level (plotted on the Y axis) in each of the 32 adenomas and corresponding samples of normal mucosa
(X axis). B. Mean expression of each gene in normal mucosa (green dots) and adenomas (red dots). Bars, confidence interval. C. Northern blot showing
reduced KIAA1199 expression in LS174T cells following doxycycline-mediated induction of β-catenin small interfering RNA, dominant-negative TCF4
(dnTCF4), or NH2-terminal-deleted TCF4 (Nk TCF4). The ~8-kb band corresponds to full-length KIAA1199 mRNA. The lower band (~5 kb) may represent
an alternative form of this mRNA. Dox, cell transfectants grown in the presence or absence of doxycycline; Tr1, a parental clone (i.e., cells expressing the
repressor protein modified by doxycycline but not transfected with β-catenin small interfering RNA, dominant-negative TCF4, or NH2-terminal-deleted
TCF4) used as a control of doxycycline exposure. Bottom, ethidium bromide-stained agarose gel as a loading control.
FIGURE 4. Expression of KIAA1199 mRNA and protein in normal intestinal mucosa and colorectal tumors. In situ hybridization studies (A-C) localized
KIAA1199 mRNA expression to the lower portion of normal epithelial crypts (A and B) and revealed that expression is markedly up-regulated in colorectal
tumors (C). Asterisk, note the different levels of expression in tumor glands and normal crypts. D. KIAA1199 protein expression is also limited to the lower
half of the normal colonic crypts, and a similar pattern is observed in the ileal mucosa (E), where the protein is expressed only in the crypts (not in the villi).
In F and G, adenomatous crypts with low-grade dysplasia present increased expression of KIAA1199, particularly in the cytoplasm facing the crypt lumen,
and in and around the mucin vacuoles of goblet cells (note the striking difference with goblet cells of normal crypts in both panels). The expression pattern
changes dramatically during the transition from low-grade dysplasia with goblet cell differentiation (H) to high-grade dysplasia in which this differentiation
is no longer apparent. J. In more advanced colon tumors, KIAA1199 overexpression is maintained. Note that, in I and J, the expression of KIAA1199 protein
(like that of KIAA1199 mRNA; C) is highest in the luminal portion of the dysplastic glands (arrowheads, multilayer of unstained nuclei occupying more than
the basal half of the dysplastic epithelium). K. Normal mucosa, with the corresponding tumor in the inset. Negative control: KIAA1199 antibody
preabsorbed with the peptide used to immunize rabbits.
>4-fold higher or lower than those observed in the normal mucosa,
but there were also several smaller changes in this category
(Supplementary Table S6) that might also have dramatic effects
on gene expression. Several other alterations reported in Table 2
are noteworthy in terms of their potential effect on cell
proliferation, differentiation, apoptosis, and cell adhesion: (a)
up-regulation of four members of the REG (regenerating)
family of genes (21, 22), which would lead to increased tissue
mitogen expression; (b) up-regulation of LCN2 (23) and down-regulation
of ZFHX1B/SIP-1 (24) in the absence of significant
changes in the expression of the epithelial cadherin CDH1
(E-cadherin), which would prevent or delay the epithelial-mesenchymal
transition [changes were also noted in the
expression of other cell adhesion genes of the cadherin and
claudin families, including the striking overexpression of the
placental cadherin gene CDH3, which is associated with early
events in the transformation process (25, 26)]; (c) down-regulation
of ZFHX1B/SIP-1 and Max dimerization protein 1
(MXD1/MAD1; decreased only 3.3-fold and therefore not listed
in Table 2; refs. 27, 28) and overexpression of the RTEL1
helicase, which should facilitate telomere elongation (29); (d)
alterations that would diminish apoptosis [e.g., overexpression
of the decoy receptor for Fas ligand, TNFRSF6B, which is
reportedly coregulated with RTEL1 on chromosome 20q13.3
Table 2. Genes Most Likely to be Involved in the Development and Evolution of Colorectal Adenomas (A Subset of Genes
Listed in Supplementary Table S4) Subdivided by Gene Ontology Category

Gene symbol   Gene name                                                        Fold difference*

Regulation of transcription
NLF1          Nuclear localized factor 1                                       33.1
FOXQ1         Forkhead box Q1                                                  24.4
MSX2          Msh homeobox homologue 2                                         22.2
ASCL2         Achaete-scute complex-like 2                                     17.3
MSX1          Msh homeobox homologue 1                                         8.5
IRX3          Iroquois homeobox protein 3                                      6.4
GRHL3         Grainyhead-like 3                                                7.9
TRIM29        Tripartite motif-containing 29                                   7.4
ETV4          Ets variant gene 4 (E1A enhancer binding protein, E1AF)          5.4
ARNTL2        Aryl hydrocarbon receptor nuclear translocator-like 2            [illegible]
TEAD4         TEA domain family member 4                                       5.2
SP5           Sp5 transcription factor                                         5.2
HES6          Hairy and enhancer of split 6                                    4.6
TBX3          T-box 3                                                          4.6
NFE2L3        Nuclear factor (erythroid-derived 2)-like 3                      4.3
GRHL1         Grainyhead-like 1                                                [illegible]
FEV           FEV (ETS oncogene family)                                        15.1
SPIB          Spi-B transcription factor                                       13.2
NEUROD1       Neurogenic differentiation 1                                     10.6
MEIS1         Meis1, myeloid ecotropic viral integration site 1                7.1
NR3C1         Nuclear receptor subfamily 3, group C, member 1                  5.9
NR5A2         Nuclear receptor subfamily 5, group A, member 2                  5.5
THRB          Thyroid hormone receptor, β                                      5.2
ZNF483        Zinc finger protein 483                                          5.1
ZFHX1B        Zinc finger homeobox 1b (SIP1)                                   4.0
MEOX2         Mesenchyme homeobox 2                                            4.7
HOXD10        Homeobox D10                                                     4.6
MAF           v-maf musculoaponeurotic fibrosarcoma oncogene                   4.5
SOX10         SRY (sex determining region Y)-box 10                            4.2

Cell proliferation/differentiation/apoptosis
REG1B         Regenerating islet-derived 1β                                    75.8
REG3A         Regenerating islet-derived 3α                                    29.5
TACSTD2       Tumor-associated calcium signal transducer 2                     21.4
IL-8          Interleukin-8                                                    14.7
SERPINB5      Serpin peptidase inhibitor, clade B, member 5 (Maspin)           13.6
REG1A         Regenerating islet-derived 1α                                    8.2
FAIM2         Fas apoptotic inhibitory molecule 2                              [illegible]
DUSP4         Dual specificity phosphatase 4                                   7.4
REG4          Regenerating islet-derived family, member 4                      6.8
PHLDA1        Pleckstrin homology-like domain, family A, member 1              6.0
LCN2          Lipocalin 2 (oncogene 24p3)                                      5.7
RTEL1         Regulator of telomere elongation helicase 1                      5.6
TGFBI         Transforming growth factor, β-induced                            [illegible]
IGFBP2        Insulin-like growth factor binding protein 2                     4.8
TDGF1         Teratocarcinoma-derived growth factor 1                          4.7
TNFRSF6B      Tumor necrosis factor receptor superfamily, member 6b, decoy     4.5
DMBT1         Deleted in malignant brain tumors 1                              4.2
TNFRSF10C     Tumor necrosis factor receptor superfamily, member 10c, decoy    4.1
ANGPTL1       Angiopoietin-like 1 (Angioarrestin)                              24.9
CDKN2B        Cyclin-dependent kinase inhibitor 2B (p15, inhibits CDK4)        14.9
GPM6B         Glycoprotein M6B                                                 11.5
ANK2          Ankyrin 2                                                        9.8
UNC5C         Unc-5 homologue C                                                7.4
HPGD          Hydroxyprostaglandin dehydrogenase 15-(NAD)                      [illegible]
CPNE8         Copine VIII                                                      5.5
FAIM3         Fas apoptotic inhibitory molecule 3                              5.4
IL6R          Interleukin-6 receptor                                           [illegible]
TUSC3         Tumor suppressor candidate 3                                     [illegible]
DUSP1         Dual specificity phosphatase 1                                   4.7
RERG          RAS-like, estrogen-regulated, growth inhibitor                   4.6
NDN           Necdin                                                           4.5
IGF1          Insulin-like growth factor 1 (somatomedin C)                     4.0

Cell adhesion
CDH3          Cadherin 3, type 1, P-cadherin                                   81.7
CLDN2         Claudin 2                                                        16.1
CLDN1         Claudin 1                                                        [illegible]
DSG3          Desmoglein 3                                                     7.3

(Continued on the following page)
Table 2. Genes Most Likely to be Involved in the Development and Evolution of Colorectal Adenomas (A Subset of Genes
Listed in Supplementary Table S4) Subdivided by Gene Ontology Category (Cont'd)

Gene symbol   Gene name                                                        Fold difference*

DSG4          Desmoglein 4                                                     5.9
CLDN8         Claudin 8                                                        [illegible]
CDH19         Cadherin 19, type 2                                              8.3
CEACAM7       Carcinoembryonic antigen-related cell adhesion molecule 7        8.3
CLDN23        Claudin 23                                                       8.0
NRXN1         Neurexin 1                                                       [illegible]
PCDH19        Protocadherin 19                                                 8.8
NLGN4X        Neuroligin 4, X-linked                                           6.0
TNXB          Tenascin XB                                                      [illegible]
MUCDHL        Mucin and cadherin-like                                          [illegible]
PCDH9         Protocadherin 9                                                  4.3
L1CAM         L1 cell adhesion molecule                                        4.2

*Overexpressed (↑) or underexpressed (↓) in adenomas (versus normal mucosa samples).



(30-32); decreased expression of the netrin-1 receptor, UNC5C (33); and expression
changes involving three Fas apoptosis inhibitory molecules (FAIM), including FAIM1,
which was increased 2.3-fold and is thus not listed in Table 2]; and (e) marked
down-regulation of several genes that would result in reduced tumor suppression
activity [e.g., those encoding the antiangiogenic factor ANGPTL1 (34), the
cyclin-dependent kinase inhibitor CDKN2B/p15, and the prostaglandin catabolism
enzyme HPGD (35)].

It is also important to recall the size-related differences noted in the adenoma
gene expression profiles (Fig. 2; Supplementary Table S3). When validated in a
larger series of tumors, these differences should provide important clues to the
molecular basis of the well-known link between the dimensions and malignant
potential of colorectal adenomas (1).

Our study also furnishes a complete picture of expression changes involving gene
components of the Wnt pathway across the transition from normal to adenomatous
epithelium (Supplementary Table S2) as well as evidence for the existence of a
novel Wnt target, KIAA1199. This gene, which encodes a protein of unknown
function, was strikingly overexpressed in all the adenomas included in this study
and in 25 adenocarcinomas of the colon described in a previous report (8). Even
more intriguingly, its expression was significantly correlated with that of
several genes that are well-established targets of Wnt signaling. Our hypothesis
that KIAA1199 is up-regulated by the TCF(s)/β-catenin transcription complex was
considerably strengthened by the marked decreases in KIAA1199 expression observed
in cultured colorectal cancer cells when the Wnt pathway was inhibited by
overexpression of dominant-negative TCF4 proteins or by β-catenin knockdown. It is
not yet clear whether this is a direct effect, but this possibility is supported
by the results of a recent genome-wide TCF4 ChIP-on-chip analysis, which indicates
that the KIAA1199 locus is surrounded by four TCF4-bound regions (Hatzis et al.,
unpublished data). These findings are consistent with the probable role of this
gene as a direct target of TCF4/β-catenin signaling in the intestine and in
colorectal tumors. Other features of KIAA1199 expression are also compatible with
its putative role as a Wnt target gene. KIAA1199 mRNA and protein are both
confined to the proliferative compartment of normal intestinal crypts, where Wnt
signaling is normally active, and they are highly overexpressed in colorectal
adenomas and carcinomas, where this pathway is almost always aberrantly activated.

In normal and tumor tissues, KIAA1199 is expressed in the cytoplasm of epithelial
cells. In glands with low-degree dysplasia, higher concentrations are observed in
the mucin vacuoles of goblet cells, but cytoplasmic expression of the protein in
tumor cells remains elevated even after goblet cell differentiation has been lost
(Fig. 4). These features, together with the localization of KIAA1199 in the
luminal portion of the cytoplasm, are suggestive of a secreted and/or membrane
protein. This conclusion is consistent with our in silico analysis of KIAA1199
(see Supplementary Data and Supplementary Fig. S5), which strongly predicts the
presence of a signal peptide at its NH2-terminal end. In addition, the central
region of KIAA1199 contains a TMEM2 homology domain, which is present in several
eukaryotic proteins, including TMEM2, polyductin, and fibrocystin L (PKHD1L1;
Fig. 5), all large receptor proteins characterized by an NH2-terminal signal
peptide or a single transmembrane helix and a short cytoplasmic tail (36).

A study based on yeast two-hybrid screens suggested that KIAA1199 may interact
with plexin A2 (KIAA0463; ref. 37). The transmembrane plexins interact with
transmembrane semaphorins on nearby cells, providing "stop" and "go" signals that
are crucial for cell motility and invasive growth (38, 39). KIAA1199/plexin A2
interaction could thus play important roles in colorectal tumorigenesis not only
in the invasive stages but also earlier during the formation of abnormal glands in
benign adenomas.

A recent report linked high levels of KIAA1199 mRNA with cell mortality in human
fibroblasts and in a renal cell carcinoma cell line (40). In that study, however,
there was no significant increase in KIAA1199 expression during replicative aging
of mortal cells, and this finding contrasts with the documented behavior of other
genes involved in cell aging (41). Furthermore, the authors reported wide
variation in KIAA1199 mRNA expression in breast cancer cell lines, and this
finding raises the possibility that expression of this gene in vivo and in cell
lines may differ.

FIGURE 5. Phylogenetic tree of the proteins containing the TMEM2 homology domain found in the central region of KIAA1199. The tree was generated
with MEGA3 (52) from the multiple sequence alignment shown in Supplementary Fig. S5. It was calculated with the minimum evolution algorithm and the JTT
matrix. Positions with gaps were removed for calculation of pairwise distances. Node robustness was assessed using the bootstrap method with 100
resamplings. (Bootstrap values are shown at the nodes.) Two branches emerged, one comprising KIAA1199 and TMEM2 and the other with polyductin,
fibrocystin L, and several other THD-containing proteins found in the ciliate Tetrahymena thermophila, which were apparently generated in a series of
Tetrahymena-specific gene duplications. The NH2-terminal repeats of polyductin and fibrocystin L clustered together, as did the COOH-terminal repeats,
suggesting that the intragenic duplication of the TH domain in the ancestor of polyductin and fibrocystin L occurred before the divergence of chordates and
echinoderms (more details in Supplementary Data).

We believe that our microarray data will serve as a
springboard and reference point for other studies on the
molecular basis of colorectal transformation along the
adenoma-carcinoma pathway (and subsequently for the study of
alternative pathways). Some of the transcriptional changes
reported in this study might one day be used as molecular
indices of the susceptibility of adenomas to malignant
transformation, information that would be helpful in planning
appropriate follow-up of the lesions. As for KIAA1199, its
invariably high expression in the colorectal tumors we studied
raises interesting possibilities for the development of a new
molecular marker for the detection of these neoplasms. For
example, because KIAA1199 expression in the normal mucosa
is limited to cells in the lower portion of the crypts, which are
not yet programmed to be shed into the intestinal lumen, the
presence of KIAA1199 peptides in fecal water might prove to
be a specific marker of adenomatous lesions. In addition,
although due consideration must be given to its probable
physiologic role(s) in intestinal crypts and possibly in several
other human tissues (40, 42, 43), KIAA1199 may be a potential
target of antibody-based therapies.

Materials and Methods 
Tumor Samples

Pedunculated colorectal polyps and normal mucosa were
obtained during colonoscopies carried out in the Gastroenterology
Unit of the Belcolle City Hospital (Viterbo, Italy). The
tissues were collected prospectively with informed patient
consent and the approval of the local Human Research Ethics
Committee. Patients with documented familial polyposis, with
>15 adenomatous polyps (total: synchronous + previously excised;
ref. 44), or currently treated with nonsteroidal anti-inflammatory
drugs (including aspirin) were excluded from the study.

For each polyp, three biopsies of normal mucosa were
collected from the same colon segment (≥2 cm from the site of
the polyp). Immediately after removal, a small sample of
epithelial tissue (5-15 mg) was cut from the tip of each polyp,
leaving the underlying muscularis mucosae intact. We excluded
polyps <1 cm to ensure that the sampling procedure would not
interfere with the histologic diagnosis. All polyp samples were
collected by a single operator (M.d.P.) using the same procedure
to minimize artifacts due to sampling differences. The approach
used allowed us to obtain specimens with a high percentage of
epithelial cells without resorting to microdissection, which can
diminish the quantity and quality of the extracted RNA.

The polyp sample and the three normal mucosal biopsies
were immersed in RNAlater (Ambion) for subsequent microarray
analysis, and the remainder of the polyp was submitted for
pathologic analysis. The cut surface at the tip was labeled with
India ink so that the sampled area could be easily identified
during routine histologic examination. The tissue was then
fixed in buffered formalin and embedded in paraffin. DNA
extracted from sections of this specimen was also used to rule
out microsatellite instability (reflecting defective DNA mismatch
repair) at the BAT26 locus, as previously described (45).



All of the polyps included in the study met the following
criteria: type (6), maximum diameter of 1 to 4 cm, absence
of surface ulceration, histologic diagnosis of adenoma, and
absence of microsatellite instability at BAT26.

In some analyses, we also included transcriptomic data from
a previously described set of 25 colon cancers (mismatch repair
proficient and deficient; ref. 8), which we reanalyzed for this
study with the same microarray used to characterize the
adenomas and normal mucosa.

Microarray Analysis, Real-time Reverse Transcription-PCR,
and Northern Blotting

Total RNA was extracted (RNeasy Mini kit, Qiagen) from
homogenized tissue samples (5-15 mg), and its integrity was
verified by capillary gel electrophoresis (Bioanalyzer, Agilent
Technologies). Complementary RNA (15 μg/sample), synthesized
and labeled as previously described (8, 46), was
hybridized with the Affymetrix U133 Plus 2.0 array, which
contains in situ synthesized oligonucleotides representing the
entire human genome (54,675 probes).

Raw gene expression data generated by GeneChip Operating
Software (Affymetrix) were imported into the GeneSpring
software program (Agilent Technologies) and normalized per
chip (i.e., to the median of all values on a given array) and per
gene (i.e., to the median expression level of the given gene
across all samples). Analysis was done using the log expression
values with GeneSpring's cross-gene error model turned on.
Probes were excluded from analysis unless they were listed as
"present or marginal calls" and/or had expression values ≥100
in ≥50% (≥16 of 32) of the samples in at least one of the tissue
groups (adenomas and normal mucosa).
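
The per-chip and per-gene normalization and the expression-threshold part of the probe filter can be illustrated with a minimal Python sketch. This is not the GeneSpring implementation: the function names `normalize` and `passes_filter` are hypothetical, the input is assumed to be a small matrix of positive raw signals (rows are probes, columns are arrays), and the "present or marginal call" criterion and the cross-gene error model are not modeled.

```python
from statistics import median

def normalize(matrix):
    """matrix[g][s] is the raw signal of probe g on array s (all values > 0)."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    # Per-chip step: divide every value by the median of its own array (column).
    chip_medians = [median(matrix[g][s] for g in range(n_genes))
                    for s in range(n_samples)]
    per_chip = [[matrix[g][s] / chip_medians[s] for s in range(n_samples)]
                for g in range(n_genes)]
    # Per-gene step: divide every value by the median of its probe (row)
    # across all samples.
    normalized = []
    for row in per_chip:
        gene_median = median(row)
        normalized.append([v / gene_median for v in row])
    return normalized

def passes_filter(values, threshold=100.0, min_fraction=0.5):
    """True if at least min_fraction of one tissue group's samples reach threshold."""
    hits = sum(1 for v in values if v >= threshold)
    return hits >= min_fraction * len(values)
```

Under these assumptions, a probe would be retained if `passes_filter` returned true for the adenoma values or for the normal mucosa values (with 32 samples per group, that means at least 16 values of 100 or more).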

Expression data were subjected to four different unsupervised
analyses: (a) hierarchical clustering using the Pearson
correlation coefficient as a similarity measure and the average
linkage algorithm for branch merging; (b) PCA, which reduces
the dimensionality (number of variables) of a data set while
retaining most of its variance (8); (c) correlation analysis, which
involved computation of Pearson correlation coefficients for all
possible sample pairs and visualization of correlation values as
tile plots; and (d) CA, another dimension-reducing method (47),
which was used to identify samples associated with particular
gene expression levels. In typical CA, a matrix of n gene
expression levels from p samples is treated as a two-way
contingency table (genes by samples or vice versa) with n and p
specifications for the "factors" gene and sample, respectively.
Each intensity value thus reflects the abundance of a given
transcript in a given sample. Like PCA, CA identifies
independent "factorial components" that account for variance
within a multidimensional gene data set, but in this case, the
components are identified and ranked according to the
correlation between gene and sample scores. A supervised or
constrained extension of CA (9), CCA, was then used to identify
possible correlations between gene expression patterns and
clinical or pathologic variables. CA and CCA, as well as the
corresponding plots, were computed using R software and the
ade4 and made4 packages furnished by Bioconductor (footnote 11).
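
As an illustration of analyses (a) and (c), the following Python sketch computes Pearson correlations between sample profiles and merges samples by average linkage, using 1 - r as the distance. It is a toy under stated assumptions, not the GeneSpring procedure: the function names and the sample labels in the usage note are hypothetical, and the dendrogram drawing, PCA, CA, and CCA steps are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles):
    """Agglomerative clustering of samples.

    profiles maps a sample name to its expression profile.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    names = list(profiles)
    # Distance between two samples: 1 - Pearson correlation.
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}

    def cluster_distance(c1, c2):
        # Average linkage: mean of all pairwise distances between members.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        d, i, j = min((cluster_distance(c1, c2), i, j)
                      for i, c1 in enumerate(clusters)
                      for j, c2 in enumerate(clusters) if i < j)
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

With four hypothetical samples, two whose profiles rise across genes and two whose profiles fall, the first two merges pair the similar samples and the final merge joins the two groups at a distance close to 2.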
The Mann-Whitney test was used to select genes differentially
expressed in normal mucosa and adenomas; Benjamini-Hochberg
multiple testing correction was applied with a false
discovery rate of 0.01. The genes in this set that were
differentially expressed with fold differences of ≥2.0 were
then analyzed with ErmineJ software (48) to identify any
biological processes from the Gene Ontology database (49) that
were overrepresented.
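
The Benjamini-Hochberg step-up procedure named above can be sketched in a few lines of Python. The sketch assumes that one p-value per gene has already been obtained (for example, from a Mann-Whitney test, which is not implemented here); the function name and the p-values in the usage note are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted ascending)
    such that p_(k) <= (k / m) * fdr, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

For the hypothetical p-values 0.0001, 0.004, 0.03, 0.0002, and 0.5 at a false discovery rate of 0.01, the three smallest values fall under their rank-specific thresholds (0.002, 0.004, and 0.006), so the first, second, and fourth genes would be called differentially expressed.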

Pearson correlation was used to identify correlation between
KIAA1199 expression and the expression of other genes in the
entire set of tissue samples. Fisher's exact test was used to
identify possible overrepresentation of known Wnt targets
among genes whose expression was closely correlated with that
of KIAA1199 (correlation values ≥0.8).
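
An overrepresentation test of this kind reduces to a one-sided Fisher's exact test on a 2 x 2 table, which is the upper tail of a hypergeometric distribution. The Python sketch below shows the calculation; the function name and the counts in the usage note are invented for illustration and are not taken from the study.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for enrichment.

    N: genes in the background set
    K: known Wnt targets in the background
    n: genes closely correlated with the gene of interest
    k: known Wnt targets among those n genes
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

With invented counts of 20 background genes, 5 known targets, and 4 correlated genes of which 3 are known targets, the tail probability is 155/4845, or about 0.032.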

Reverse transcription-PCR and Northern blotting were done
as previously described (46, 50) to verify the expression level
of KIAA1199 in tissue samples and in LS174T colon cancer
cells in which inducible inhibition of the Wnt pathway had been
achieved with previously described methods (14-16).

In situ Hybridization

Digoxigenin-labeled KIAA1199 antisense riboprobes were
synthesized from a PCR product amplified from human colon
cDNA with KIAA1199-specific primers (sense:
5'-acatcggggaggagatcga-3'; antisense, containing a T7 RNA polymerase-binding
site: 5'-taatacgactcactatagggttccagacUgaca-3'). This
product was transcribed in vitro using the DIG RNA labeling
kit and T7 RNA polymerase (Roche Diagnostics). In situ
hybridizations were done on paraffin-embedded sections of
human colon fixed with 4% buffered formalin as described
elsewhere (51).

Immunohistochemistry 

Our in silico analysis of KIAA1199 (see Supplementary Data)
indicated that residues 202 to 217 (IHSDRFQTYRSKKESE)
form a loop between a conserved β-strand and the following
helix of the NH2-terminal GG domain. This charged,
surface-exposed peptide was used to raise a rabbit polyclonal antibody,
which was purified by affinity chromatography on Thiopropyl
Sepharose 6B (Amersham) derivatized with the antigenic
peptide. A 1:1,000 dilution of this antibody was used, as
previously described (45), to evaluate KIAA1199 expression in
formalin-fixed, paraffin-embedded sections of adenoma and
normal mucosal tissues.
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